Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice
- PMID: 22962004
- PMCID: PMC3526802
- DOI: 10.1093/sysbio/sys078
Abstract
The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees.
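To make the idea of a rogue taxon concrete, the following is a minimal sketch and not the graph-based algorithm described in the abstract. It assumes rooted trees written as nested tuples, uses clades in place of unrooted bipartitions, scores a tree set by the summed frequency of clades present in more than half of the trees, and tests only single-taxon removals by brute force. All function names and the toy tree set are hypothetical and chosen for illustration.

```python
from collections import Counter

def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def pruned_clades(tree, dropped):
    """Clades of the tree after removing the dropped taxa, without trivial ones."""
    leaves, found = clades(tree)
    remaining = leaves - dropped
    result = set()
    for clade in found:
        clade = clade - dropped
        # Keep only informative clades: at least two taxa, fewer than all.
        if 2 <= len(clade) < len(remaining):
            result.add(clade)
    return result

def majority_support(trees, dropped=frozenset()):
    """Sum of frequencies of clades found in more than half of the trees."""
    counts = Counter()
    for tree in trees:
        counts.update(pruned_clades(tree, dropped))
    n = len(trees)
    return sum(k / n for k in counts.values() if k / n > 0.5)

def best_single_rogue(trees):
    """Taxon whose removal raises majority-rule support the most, with the gain."""
    taxa, _ = clades(trees[0])
    base = majority_support(trees)
    gains = {t: majority_support(trees, frozenset([t])) - base for t in taxa}
    best = max(sorted(gains), key=gains.get)
    return best, gains[best]

# Toy tree set (hypothetical): taxon "X" attaches to a different place in every
# tree, while the backbone ((A,B),(C,D)) is otherwise stable.
trees = [
    ((("A", "X"), "B"), ("C", "D")),
    (("A", ("B", "X")), ("C", "D")),
    (("A", "B"), (("C", "X"), "D")),
    (("A", "B"), ("C", ("D", "X"))),
]

print(best_single_rogue(trees))
```

In this toy set no clade occurs in more than half of the trees until "X" is pruned, after which both (A,B) and (C,D) appear in every tree. A realistic criterion would also weigh the information lost by removing taxa and would consider sets of several taxa at once, which this sketch does not attempt.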