Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2007 Jan 29:8:2.
doi: 10.1186/1471-2156-8-2.

Empirical vs Bayesian approach for estimating haplotypes from genotypes of unrelated individuals

Affiliations

Empirical vs Bayesian approach for estimating haplotypes from genotypes of unrelated individuals

Shuying Sue Li et al. BMC Genet. .

Abstract

Background: The completion of the HapMap project has stimulated further development of haplotype-based methodologies for disease associations. A key aspect of such development is the statistical inference of individual diplotypes from unphased genotypes. Several methodologies for inferring haplotypes have been developed, but they have not been evaluated extensively to determine which method not only performs well, but also can be easily incorporated in downstream haplotype-based association analyses. In this paper, we attempt to do so. Our evaluation was carried out by comparing the two leading Bayesian methods, implemented in PHASE and HAPLOTYPER, and the two leading empirical methods, implemented in PL-EM and HPlus. We used these methods to analyze real data, namely the dense genotypes on X-chromosome of 30 European and 30 African trios provided by the International HapMap Project, and simulated genotype data. Our conclusions are based on these analyses.

Results: All programs performed very well on X-chromosome data, with an average similarity index of 0.99 and an average prediction rate of 0.99 for both European and African trios. On simulated data with approximation of coalescence, PHASE implementing the Bayesian method based on the coalescence approximation outperformed other programs on small sample sizes. When the sample size increased, other programs performed as well as PHASE. PL-EM and HPlus implementing empirical methods required much less running time than the programs implementing the Bayesian methods. They required only one hundredth or thousandth of the running time required by PHASE, particularly when analyzing large sample sizes and large umber of SNPs.

Conclusion: For large sample sizes (hundreds or more), which most association studies require, the two empirical methods might be used since they infer the haplotypes as accurately as any Bayesian methods and can be incorporated easily into downstream haplotype-based analyses such as haplotype-association analyses.

PubMed Disclaimer

Figures

Figure 1
Figure 1
The relationship between the performances of haplotyping methods and the percentage of individuals with uncertainty haplotypes. The plots illustrate for the performances (in similarity Index and Prediction Rate) of empirical methods (PL-EM and HPlus) and Bayesian methods (PHASE and HAPLOTYPER) on analyzing the 500 randomly selected haplotype blocks of the 30 European mothers' genotypes on X-chromosome.
Figure 2
Figure 2
The relationship between the performances of haplotyping methods and the linkage disequilibrium (LD) of the haplotypes within blocks. The plots illustrate for the performances (in similarity Index and Prediction Rate) of empirical methods (PL-EM and HPlus) and Bayesian methods (PHASE and HAPLOTYPER) on analyzing the 500 randomly selected haplotype blocks of the 30 European mothers' genotypes on X-chromosome.

References

    1. The International HapMap Consortium A haplotype map of the human genome. Nature. 2005;437:1299–1320. doi: 10.1038/nature04226. - DOI - PMC - PubMed
    1. Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, Lavery T, Kouyoumjian R, Farhadian SF, Ward R, Lander ES. Linkage disequilibrium in the human genome. Nature. 2001;411:199–204. doi: 10.1038/35075590. - DOI - PubMed
    1. Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES. High-resolution haplotype structure in the human genome. Nature Genetics. 2001;29:229–232. doi: 10.1038/ng1001-229. - DOI - PubMed
    1. Goldstein DB. Islands of linkage disequilibrium. Nature Genetics. 2001;29:109–111. doi: 10.1038/ng1001-109. - DOI - PubMed
    1. Patil N, Berno AJ, Hinds DA, Barrett WA, Doshi JM, Hacker CR, Kautzer CR, Lee DH, Marjoribanks C, McDonough DP, Nguyen BT, Norris MC, Sheehan JB, Shen N, Stern D, Stokowski RP, Thomas DJ, Trulson MO, Vyas KR, Frazer KA, Fodor SP, Cox DR. Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science. 2001;294:1719–1723. doi: 10.1126/science.1065573. - DOI - PubMed

Publication types

LinkOut - more resources