trioPhaser: using Mendelian inheritance logic to improve genomic phasing of trios
- PMID: 34809557
- PMCID: PMC8607709
- DOI: 10.1186/s12859-021-04470-4
Abstract
Background: When analyzing DNA sequence data of an individual, knowing which nucleotide was inherited from each parent can be beneficial when trying to identify certain types of DNA variants. Mendelian inheritance logic can be used to accurately phase (haplotype) the majority (67-83%) of an individual's heterozygous nucleotide positions when genotypes are available for both parents (trio). However, when all members of a trio are heterozygous at a position, Mendelian inheritance logic cannot be used to phase. For such positions, a computational phasing algorithm can be used. Existing phasing algorithms use a haplotype reference panel, sequencing reads, and/or parental genotypes to phase an individual; however, they are limited in that they can only phase certain types of variants, require a specific genotype build, require large amounts of storage capacity, and/or require long run times. We created trioPhaser to address these challenges.
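As an illustration of the Mendelian inheritance logic described above, the following is a minimal sketch and not the trioPhaser implementation. It assumes unphased genotypes are given as pairs of allele strings; the function name `phase_child_by_inheritance` and the (paternal, maternal) output order are hypothetical choices made for this example.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]  # unphased pair of alleles, e.g. ("A", "G")


def phase_child_by_inheritance(
    child: Genotype, father: Genotype, mother: Genotype
) -> Optional[Genotype]:
    """Return (paternal_allele, maternal_allele) for the child.

    Returns None when the assignment is ambiguous (for example, all three
    trio members are heterozygous for the same alleles) or when the
    genotypes are inconsistent with Mendelian inheritance.
    """
    first, second = child
    # Try both ways of assigning the child's two alleles to the parents and
    # keep only assignments where each parent actually carries that allele.
    consistent = {
        (paternal, maternal)
        for paternal, maternal in ((first, second), (second, first))
        if paternal in father and maternal in mother
    }
    if len(consistent) == 1:
        return consistent.pop()
    return None  # zero options: Mendelian error; two options: ambiguous


# Father can only transmit "A", so the child's "G" must be maternal.
print(phase_child_by_inheritance(("A", "G"), ("A", "A"), ("A", "G")))
# All three heterozygous: inheritance logic alone cannot decide.
print(phase_child_by_inheritance(("A", "G"), ("A", "G"), ("A", "G")))
```

The second call shows the case the abstract singles out: when every member of the trio is heterozygous, both assignments remain possible, so the function returns None and another phasing method is needed.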
Results: trioPhaser uses gVCF files from an individual and their parents as initial input, and then outputs a phased VCF file. Input trio data are first phased using Mendelian inheritance logic. Then, the positions that cannot be phased using inheritance information alone are phased by the SHAPEIT4 phasing algorithm. Using whole-genome sequencing data of 52 trios, we show that trioPhaser, on average, increases the total number of phased positions by 21.0% and 10.5%, respectively, when compared to the number of positions that SHAPEIT4 or Mendelian inheritance logic can phase when either is used alone. In addition, we show that the accuracy of the phased calls output by trioPhaser is similar to that of linked-read and read-backed phasing.
Conclusion: trioPhaser is a containerized software tool that uses both Mendelian inheritance logic and SHAPEIT4 to phase trios when gVCF files are available. By implementing both phasing methods, more variant positions are phased compared to what either method is able to phase alone.
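The two-stage strategy summarized above (inheritance logic first, a second phasing method for the remaining positions) can be sketched as follows. This is an assumption-laden illustration and not trioPhaser's code: the names `mendelian_phase` and `phase_sites` are hypothetical, and `fallback` is a placeholder callable standing in for a statistical phaser such as SHAPEIT4, whose actual invocation is not shown here.

```python
from typing import Callable, Dict, List, Optional, Tuple

Genotype = Tuple[str, str]
Trio = Tuple[Genotype, Genotype, Genotype]  # (child, father, mother)


def mendelian_phase(
    child: Genotype, father: Genotype, mother: Genotype
) -> Optional[Genotype]:
    """Return (paternal, maternal) alleles, or None if not uniquely determined."""
    options = {
        (p, m) for p, m in (child, child[::-1]) if p in father and m in mother
    }
    return options.pop() if len(options) == 1 else None


def phase_sites(
    sites: Dict[int, Trio],
    fallback: Callable[[List[int]], Dict[int, Genotype]],
) -> Tuple[Dict[int, Genotype], List[int]]:
    """Phase by inheritance first; hand unresolved positions to `fallback`.

    `fallback` is a hypothetical stand-in for an external phasing tool.
    Note that in this sketch Mendelian-inconsistent sites are also passed
    to the fallback; a real pipeline might filter them instead.
    """
    phased: Dict[int, Genotype] = {}
    unresolved: List[int] = []
    for pos in sorted(sites):
        child, father, mother = sites[pos]
        call = mendelian_phase(child, father, mother)
        if call is None:
            unresolved.append(pos)
        else:
            phased[pos] = call
    if unresolved:
        phased.update(fallback(unresolved))
    return phased, unresolved


# Toy data: position 100 is resolvable by inheritance, position 200 is not.
toy_sites = {
    100: (("A", "G"), ("A", "A"), ("G", "G")),
    200: (("C", "T"), ("C", "T"), ("C", "T")),
}
# Dummy fallback that always returns the same orientation (illustration only).
phased, unresolved = phase_sites(toy_sites, lambda ps: {p: ("C", "T") for p in ps})
print(phased, unresolved)
```

Keeping the fallback as a plain callable makes the division of labor explicit: only positions that inheritance logic leaves undecided are passed on, which is why the combined approach can phase more positions than either method alone.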
Keywords: Genomics; Haplotyping; Next-generation sequencing; Phasing; Trios.
© 2021. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.
Similar articles
- Benchmarking phasing software with a whole-genome sequenced cattle pedigree. BMC Genomics. 2022 Feb 15;23(1):130. doi: 10.1186/s12864-022-08354-6. PMID: 35164677. Free PMC article.
- Read-based phasing of related individuals. Bioinformatics. 2016 Jun 15;32(12):i234-i242. doi: 10.1093/bioinformatics/btw276. PMID: 27307622. Free PMC article.
- Genotype calling and phasing using next-generation sequencing reads and a haplotype scaffold. Bioinformatics. 2013 Jan 1;29(1):84-91. doi: 10.1093/bioinformatics/bts632. Epub 2012 Oct 23. PMID: 23093610.
- Haplotyping-Assisted Diploid Assembly and Variant Detection with Linked Reads. Methods Mol Biol. 2023;2590:161-182. doi: 10.1007/978-1-0716-2819-5_11. PMID: 36335499. Review.
- Towards accurate, contiguous and complete alignment-based polyploid phasing algorithms. Genomics. 2022 May;114(3):110369. doi: 10.1016/j.ygeno.2022.110369. Epub 2022 Apr 26. PMID: 35483655. Review.
Cited by
- haploMAGIC: accurate phasing and detection of recombination in multiparental populations despite genotyping errors. G3 (Bethesda). 2024 Aug 7;14(8):jkae109. doi: 10.1093/g3journal/jkae109. PMID: 38808682. Free PMC article.
- Estimating Gene Conversion Tract Length and Rate From PacBio HiFi Data. Mol Biol Evol. 2025 Feb 3;42(2):msaf019. doi: 10.1093/molbev/msaf019. PMID: 39982809. Free PMC article.
- A novel GATA2 distal enhancer mutation results in MonoMAC syndrome in 2 second cousins. Blood Adv. 2023 Oct 24;7(20):6351-6363. doi: 10.1182/bloodadvances.2023010458. PMID: 37595058. Free PMC article.
- Using existing pediatric cancer data from the Gabriella Miller Kids First Data Resource Program. JNCI Cancer Spectr. 2023 Oct 31;7(6):pkad079. doi: 10.1093/jncics/pkad079. PMID: 37788089. Free PMC article.