Maximum likelihood estimates of species trees: how accuracy of phylogenetic inference depends upon the divergence history and sampling design
- PMID: 20525604
- DOI: 10.1093/sysbio/syp045
Abstract
The understanding that gene trees are often in discord with each other and with the species trees that contain them has led researchers to methods that incorporate the inherent stochasticity of genetic processes in the phylogenetic estimation procedure. Recently developed methods for species-tree estimation that not only consider the retention and sorting of ancestral polymorphism but also quantify the actual probabilities of incomplete lineage sorting are expected to improve on earlier summary-statistic-based approaches, which discard much of the information content of gene trees. However, these new methods have yet to be tested on truly challenging evolutionary histories, such as those marked by recent rapid speciation, where high levels of incomplete lineage sorting and discord among gene trees predominate. Here, we test a new maximum-likelihood method that incorporates stochastic models of both nucleotide substitution and lineage sorting for species-tree estimation. Using a simulation approach, we consider a broad range of species-tree topologies under two scenarios representing moderate and severe incomplete lineage sorting. We show that the maximum-likelihood method yields more accurate species trees than a summary-statistic-based approach, demonstrating that information contained in discordant gene trees can be effectively extracted using a full probabilistic model. Moreover, we demonstrate that the shape of the original species tree (i.e., the relative lengths of internal branches) has a significant impact on whether the species tree is estimated accurately. In the speciation histories explored here, it is not just the recent origin of species that affects the accuracy of the estimates but also the variance in relative species divergence times.
Additionally, we show that sampling effort (number of individuals and/or loci) and sampling design (ratio of individuals to loci) are both important factors affecting the accuracy of species-tree estimates, with that accuracy again depending on the relative timing of divergence among species. We discuss the inherent difficulties of estimating relationships when species have undergone a recent radiation, in particular the limitations of maximum-likelihood estimates of species trees that do not consider uncertainty in the estimated gene trees of individual loci. Thus, despite substantial improvements over current summary-statistic-based approaches and the increased sophistication of procedures that incorporate the process of gene lineage coalescence, recent radiations still appear to pose daunting challenges for phylogenetics.
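To illustrate why short internal branches make species trees hard to recover, the following minimal Python sketch uses the standard three-species coalescent result: with one sampled lineage per species and an internal branch of length t (in coalescent units), a gene tree disagrees with the species tree with probability (2/3)·exp(-t). This is not the maximum-likelihood method tested in the article. The function names `p_discordant` and `simulate_discordance` and the example branch lengths are illustrative assumptions.

```python
import math
import random


def p_discordant(t):
    """Analytic probability that a three-taxon gene tree is discordant.

    t is the internal branch length of the species tree in coalescent
    units. Assumes one sampled lineage per species and no gene flow.
    """
    return (2.0 / 3.0) * math.exp(-t)


def simulate_discordance(t, n_loci, seed=0):
    """Estimate the same probability by simulating independent loci."""
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_loci):
        # The two sister lineages coalesce at rate 1 (coalescent units)
        # within the internal branch of length t.
        if rng.expovariate(1.0) > t:
            # Incomplete lineage sorting: all three lineages enter the
            # root population, where each of the three pairs is equally
            # likely to coalesce first; two of those pairs are discordant.
            if rng.random() < 2.0 / 3.0:
                discordant += 1
    return discordant / n_loci


if __name__ == "__main__":
    # Hypothetical branch lengths, from a rapid radiation to a well
    # separated divergence history.
    for t in (0.1, 0.5, 1.0, 3.0):
        analytic = p_discordant(t)
        simulated = simulate_discordance(t, n_loci=20000)
        print(f"t={t:.1f}  analytic={analytic:.3f}  simulated={simulated:.3f}")
```

Under these assumptions, discordance approaches 2/3 as the internal branch shrinks toward zero, which is consistent with the abstract's point that the relative lengths of internal branches strongly affect whether the species tree is estimated accurately.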