What is the danger of the anomaly zone for empirical phylogenetics?
- PMID: 20525606
- DOI: 10.1093/sysbio/syp047
Abstract
The increasing number of observations of gene trees with discordant topologies in phylogenetic studies has raised awareness about the problems of incongruence between species trees and gene trees. Moreover, theoretical treatments focusing on the impact of coalescent variance on phylogenetic study have also identified situations where the most probable gene trees are ones that do not match the underlying species tree (i.e., anomalous gene trees [AGTs]). However, although the theoretical proof of the existence of AGTs is alarming, the actual risk that AGTs pose to empirical phylogenetic study is far from clear. Establishing the conditions (i.e., the branch lengths in a species tree) for which AGTs are possible does not address the critical issue of how prevalent they might be. Furthermore, theoretical characterization of the species trees for which AGTs may pose a problem (i.e., the anomaly zone or the species histories for which AGTs are theoretically possible) is based on consideration of just one source of variance that contributes to species tree and gene tree discord: gene lineage coalescence. Yet, empirical data contain another important stochastic component: mutational variance. Estimated gene trees will differ from the underlying gene trees (i.e., the actual genealogy) because of the random process of mutation. Here, we take a simulation approach to investigate the prevalence of AGTs among estimated gene trees, thereby characterizing the boundaries of the anomaly zone taking into account both coalescent and mutational variances. We also determine the frequency of realized AGTs, which is critical to putting the theoretical work on AGTs into a realistic biological context. Two salient results emerge from this investigation. First, our results show that mutational variance can indeed expand the parameter space (i.e., the relative branch lengths in a species tree) where AGTs might be observed in empirical data. By exploring the underlying cause for the expanded anomaly zone, we identify aspects of empirical data relevant to avoiding the problems that AGTs pose for species tree inference from multilocus data. Second, for the empirical species histories where AGTs are possible, unresolved trees, not AGTs, predominate the pool of estimated gene trees. This result suggests that the risk of AGTs, while they exist in theory, may rarely be realized in practice. By considering the biological realities of both mutational and coalescent variances, the study has refined, and redefined, what the actual challenges are for empirical phylogenetic study of recently diverged taxa that have speciated rapidly: AGTs themselves are unlikely to pose a significant danger to empirical phylogenetic study.
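As a rough illustration of the coalescent variance the abstract refers to, the sketch below simulates gene tree discordance in the simplest setting: three taxa with one internal species-tree branch of length t, measured in coalescent units. This is not the authors' simulation and it does not model mutational variance. It is a minimal sketch, and the function names and parameters are hypothetical. In the three-taxon case the matching topology is always the most probable one, so no AGTs arise here; the anomaly zone requires four or more taxa. The example only shows how short internal branches push the frequency of the matching topology toward one third.

```python
import math
import random


def concordance_probability(t):
    """Analytic probability that a three-taxon gene tree topology matches
    the species tree, for an internal branch of length t (coalescent units)."""
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_concordance(t, n_loci, seed=0):
    """Monte Carlo estimate of the same probability (illustrative only).

    For each locus, the two sister lineages coalesce within the internal
    branch if an Exponential(1) waiting time is shorter than t. Otherwise
    all three lineages enter the ancestral population and each of the
    three topologies is equally likely.
    """
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_loci):
        wait = rng.expovariate(1.0)  # pairwise coalescence time, rate 1
        if wait < t:
            matches += 1  # coalesced in the internal branch: concordant
        elif rng.random() < 1.0 / 3.0:
            matches += 1  # concordant by chance in the ancestral population
    return matches / n_loci


if __name__ == "__main__":
    for t in (0.05, 0.5, 2.0):
        est = simulate_concordance(t, n_loci=100_000, seed=1)
        print(f"t={t}: simulated={est:.3f}, analytic={concordance_probability(t):.3f}")
```

The simulated frequencies should track the analytic values closely. Extending a sketch like this toward the question the abstract asks would require at least four taxa and a model of sequence evolution on each simulated genealogy, which is beyond what is shown here.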