Nuclear gene phylogeography using PHASE: dealing with unresolved genotypes, lost alleles, and systematic bias in parameter estimation
- PMID: 20429950
- PMCID: PMC2880299
- DOI: 10.1186/1471-2148-10-118
Abstract
Background: A widely used approach for screening nuclear DNA markers is to obtain sequence data and use bioinformatic algorithms to estimate which two alleles are present in heterozygous individuals. It is common practice to omit unresolved genotypes from downstream analyses, but the implications of this have not been investigated. We evaluated the haplotype reconstruction method implemented by PHASE in the context of phylogeographic applications. Empirical sequence datasets from five non-coding nuclear loci with gametic phase ascribed by molecular approaches were coupled with simulated datasets to investigate three key issues: (1) haplotype reconstruction error rates and the nature of inference errors, (2) dataset features and genotypic configurations that drive haplotype reconstruction uncertainty, and (3) impacts of omitting unresolved genotypes on levels of observed phylogenetic diversity and the accuracy of downstream phylogeographic analyses.
Results: We found that PHASE usually had a very low false-positive rate (i.e., it rarely inferred an incorrect haplotype pair with high confidence). The majority of genotypes that could not be resolved with high confidence included an allele occurring only once in a dataset, and genotypic configurations involving two low-frequency alleles were disproportionately represented in the pool of unresolved genotypes. The standard practice of omitting unresolved genotypes from downstream analyses can lead to considerable reductions in overall phylogenetic diversity that are skewed towards the loss of alleles with larger-than-average pairwise sequence divergences, and in turn, this causes systematic bias in estimates of important population genetic parameters.
Conclusions: A combination of experimental and computational approaches for resolving phase of segregating sites in phylogeographic applications is essential. We outline practical approaches to mitigating potential impacts of computational haplotype reconstruction on phylogeographic inferences. With targeted application of laboratory procedures that enable unambiguous phase determination via physical isolation of alleles from diploid PCR products, relatively little investment of time and effort is needed to overcome the observed biases.
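The bias described in the Results can be illustrated with a toy calculation. The sketch below is not the authors' analysis and does not call PHASE. It assumes hypothetical four-site haplotypes, hypothetical phase-call probabilities, and an assumed confidence cutoff of 0.90, and it uses the mean number of pairwise differences as a simple stand-in for a diversity-based population genetic parameter. The only poorly resolved genotype carries a divergent allele seen once in the sample, so dropping that genotype removes the allele entirely.

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Count the sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_differences(haplotypes):
    """Mean number of differences over all pairs of sampled haplotype copies."""
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def pool_haplotypes(genotypes, min_prob=0.0):
    """Flatten genotypes into haplotype copies, keeping calls with prob >= min_prob."""
    return [h for h1, h2, prob in genotypes if prob >= min_prob for h in (h1, h2)]


# Hypothetical toy data: (haplotype 1, haplotype 2, phase-call probability).
# Homozygotes and single-site heterozygotes have no phase ambiguity (1.00).
genotypes = [
    ("AAAA", "AAAA", 1.00),
    ("AAAA", "AAAT", 1.00),
    ("AAAT", "AAAT", 1.00),
    ("AAAA", "AAAT", 1.00),
    ("AAAA", "GGCA", 0.55),  # three heterozygous sites, divergent singleton allele
]

THRESHOLD = 0.90  # assumed cutoff, chosen only for illustration

full = pool_haplotypes(genotypes)             # phase known for every individual
kept = pool_haplotypes(genotypes, THRESHOLD)  # unresolved genotypes omitted

pi_full = mean_pairwise_differences(full)
pi_kept = mean_pairwise_differences(kept)

print(f"alleles observed: full={len(set(full))}, filtered={len(set(kept))}")
print(f"mean pairwise differences: full={pi_full:.3f}, filtered={pi_kept:.3f}")
```

In this toy sample, omitting the single unresolved genotype removes one of the three alleles and roughly halves the diversity estimate, because the lost allele is the most divergent one. Real datasets will differ in magnitude; the example only shows the direction of the bias.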