Understanding the accuracy of statistical haplotype inference with sequence data of known phase
- PMID: 17922479
- PMCID: PMC2291540
- DOI: 10.1002/gepi.20185
Abstract
Statistical methods for haplotype inference from multi-site genotypes of unrelated individuals have important applications in association studies and population genetics. Understanding the factors that affect the accuracy of this inference is important, but their assessment has been restricted by the limited availability of biological data with known phase. We created hybrid cell lines monosomic for human chromosome 19 and produced single-chromosome complete sequences of a 48 kb genomic region in 39 individuals of African American (AA) and European American (EA) origin. We used these phase-known genotypes and coalescent simulations to assess the accuracy of statistical haplotype reconstruction by several algorithms. Accuracy of phase inference was quite low in our biological data even for regions as short as 25-50 kb, suggesting that caution is needed when analyzing reconstructed haplotypes. Moreover, the estimated confidence in phase inference is not reliable enough to support incorporating site-specific uncertainty information in subsequent analyses. We show that, in samples of certain mixed ancestry (AA and EA populations), the most accurate haplotypes are probably obtained by analyzing the largest, pooled sample, despite the hypothetical problems associated with pooling across heterogeneous samples. Strategies to improve confidence in reconstructed haplotypes, and realistic alternatives to the analysis of inferred haplotypes, are discussed.