Eur J Hum Genet. 2020 Jun;28(6):790-803.
doi: 10.1038/s41431-020-0574-3. Epub 2020 Jan 29.

Genotype phasing in pedigrees using whole-genome sequence data

August N Blackburn et al. Eur J Hum Genet. 2020 Jun.

Abstract

Phasing is the process of inferring haplotypes from genotype data. Efficient algorithms and associated software for accurate phasing in pedigrees are needed, especially for populations lacking reference panels of sequenced individuals. We present a novel method for phasing genotypes from whole-genome sequence data in pedigrees, called PULSAR (Phasing Using Lineage Specific Alleles/Rare variants). The method is based on the property that alleles specific to a single founding chromosome within a pedigree are highly informative for identifying haplotypes that are shared identical by descent. Simulation studies are used to assess the performance of PULSAR with various pedigree sizes and structures, and the effect of genotyping errors and the presence of nonsequenced individuals is investigated. In pedigrees with complete sequencing and realistic genotyping error rates, PULSAR correctly phases >99.9% of heterozygous genotypes, excluding sites at which all individuals are heterozygous, and does so with a switch error rate frequently below 10⁻⁴. PULSAR is highly accurate, capable of genotype error correction and imputation, and computationally competitive with alternative phasing software applicable to pedigrees. Our method has the significant advantage of not requiring reference panels that are essential for other population-based phasing algorithms. A software implementation of PULSAR is freely available.
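
To illustrate the property the method builds on, the following minimal sketch checks whether the carriers of an allele descend from at least one common founder, which is a necessary condition for the allele to be lineage specific. This is not the PULSAR implementation: the pedigree encoding, the function names `founders_of` and `common_founders`, and the example individuals are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical encoding: individual -> (father, mother); founders map to (None, None).
Pedigree = Dict[str, Tuple[Optional[str], Optional[str]]]


def founders_of(ind: str, ped: Pedigree) -> Set[str]:
    """Return the founders from whom `ind` descends (a founder returns itself)."""
    father, mother = ped[ind]
    if father is None and mother is None:
        return {ind}
    result: Set[str] = set()
    for parent in (father, mother):
        if parent is not None:
            result |= founders_of(parent, ped)
    return result


def common_founders(carriers: Iterable[str], ped: Pedigree) -> Set[str]:
    """Founders shared by every carrier; an empty set means the shared
    allele cannot be lineage specific within the known pedigree."""
    sets = [founders_of(c, ped) for c in carriers]
    return set.intersection(*sets) if sets else set()


# Illustrative three-generation pedigree (all names are made up).
ped: Pedigree = {
    "F1": (None, None), "F2": (None, None), "F3": (None, None),
    "C1": ("F1", "F2"), "C2": ("F1", "F2"),
    "G1": ("F3", "C1"),
}

shared = common_founders(["C2", "G1"], ped)      # carriers related through F1 and F2
unrelated = common_founders(["F3", "C2"], ped)   # carriers with no founder in common
```

In this toy pedigree the first pair of carriers has two founders in common, so the allele is a candidate lineage-specific allele, while the second pair has none, so the allele must have entered the pedigree more than once.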

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Fig. 1. Inheritance patterns of lineage-specific alleles.
a A pattern of individuals who share an allele (red) but do not share a common founder (within the limits of the available pedigree information). The shared allele cannot therefore be lineage specific. b A pedigree in which the individuals sharing an allele (orange) have two founders in common. The shared allele is potentially lineage specific. [Legend: ■ male; ● female; slash indicates an unsequenced individual].
Fig. 2. Effect of genotyping accuracy and IBD sharing (number of sibs) on the switch error rate (SER; the proportion of adjacent heterozygous genotypes incorrectly phased relative to one another).
An SER of zero indicates perfect phasing; an SER of one indicates that no adjacent heterozygous genotypes were correctly phased. Results are based on 20 simulations in nuclear families having 1–7 children.
Fig. 3. Effect of IBD sharing on the accuracy of genotype reconstruction according to the haplotypes generated by PULSAR.
Results are based on 20 simulations in nuclear families having 1–7 children.
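
The switch error rate reported in Fig. 2 can be sketched as follows. This uses a common definition (the fraction of adjacent heterozygous site pairs whose relative phase is wrong) and is an assumption rather than the authors' exact evaluation code; the function name `switch_error_rate` and the 0/1 phase encoding are hypothetical.

```python
from typing import Sequence


def switch_error_rate(true_phase: Sequence[int], inferred_phase: Sequence[int]) -> float:
    """Fraction of adjacent heterozygous sites phased inconsistently.

    Each entry is 0 or 1, indicating which allele sits on the first
    haplotype at one heterozygous site (an assumed encoding).
    """
    if len(true_phase) != len(inferred_phase):
        raise ValueError("phase vectors must have the same length")
    n = len(true_phase)
    if n < 2:
        return 0.0  # no adjacent pairs to evaluate
    # agree[i] is True when the inferred orientation matches the truth at site i.
    agree = [t == p for t, p in zip(true_phase, inferred_phase)]
    # A switch occurs wherever agreement changes between neighbouring sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (n - 1)


one_switch = switch_error_rate([0, 1, 1, 0, 1], [0, 1, 0, 1, 0])   # single switch after site 2
global_flip = switch_error_rate([0, 1, 1, 0, 1], [1, 0, 0, 1, 0])  # haplotype labels swapped
every_pair = switch_error_rate([0, 0, 0, 0], [0, 1, 0, 1])         # switch at every adjacent pair
```

Swapping the two haplotype labels everywhere yields an SER of zero, because only the phase of each site relative to its neighbour is scored.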
