Syst Biol. 2013 Jan 1;62(1):162-6.
doi: 10.1093/sysbio/sys078. Epub 2012 Sep 6.

Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice

Andre J Aberer et al. Syst Biol.

Abstract

The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees.
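The effect described in the abstract can be illustrated with a minimal sketch. This is not the authors' graph-based algorithm; it only shows, on a hypothetical toy tree set, how removing one unstable taxon can raise the support of a majority-rule consensus. The function names (`canonical`, `consensus_support`), the taxon labels, the four toy trees, and the use of summed consensus support as the score are all assumptions made for illustration.

```python
from collections import Counter

def canonical(side, taxa):
    """Return the side of a bipartition that excludes the smallest taxon.

    Taxa not in `taxa` (i.e. pruned taxa) are dropped first. Returns None
    when the bipartition becomes trivial (fewer than 2 taxa on either side).
    """
    side = frozenset(side) & taxa
    if min(taxa) in side:
        side = taxa - side
    if len(side) < 2 or len(taxa) - len(side) < 2:
        return None
    return side

def consensus_support(trees, taxa, pruned=frozenset(), threshold=0.5):
    """Map each consensus bipartition to its frequency in the tree set.

    `trees` is a list of trees, each given as a list of bipartition sides.
    Bipartitions with frequency <= threshold are not part of the consensus.
    """
    kept = frozenset(taxa) - frozenset(pruned)
    counts = Counter()
    for tree in trees:
        splits = {canonical(s, kept) for s in tree}
        splits.discard(None)  # ignore bipartitions made trivial by pruning
        counts.update(splits)
    n = len(trees)
    return {s: c / n for s, c in counts.items() if c / n > threshold}

# Hypothetical toy data: the backbone ((A,B),C,(D,E)) is stable,
# but taxon R attaches to a different branch in every tree.
TAXA = {"A", "B", "C", "D", "E", "R"}
TREES = [
    [{"A", "R"}, {"A", "B", "R"}, {"D", "E"}],  # R sister to A
    [{"D", "R"}, {"D", "E", "R"}, {"A", "B"}],  # R sister to D
    [{"C", "R"}, {"A", "B"}, {"D", "E"}],       # R sister to C
    [{"E", "R"}, {"D", "E", "R"}, {"A", "B"}],  # R sister to E
]

before = consensus_support(TREES, TAXA)
after = consensus_support(TREES, TAXA, pruned={"R"})
print(len(before), sum(before.values()))
print(len(after), sum(after.values()))
```

In this toy set, the consensus of the full trees contains a single bipartition with 75% support, whereas pruning R leaves two bipartitions that each appear in every tree. A real rogue search must also decide which taxa (or sets of taxa) to drop, which is the expensive part the paper's algorithm addresses.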

Figures

Figure 1
Runtimes for the STA, BMA, and RNR algorithms with maximum dropset size l := 1 and l := 2. The x-axis shows the initial number of bipartitions |ℬ| in a bootstrap tree collection. Runtimes are shown for MRC as the consensus threshold (results for SC are similar).
Figure 2
Support improvement (in %) for optimization with an MRC threshold. RNR-l denotes RNR runs with l ∈ [1,3]; BMA-mod is a less conservative modification of the BMA.
Figure 3
Support improvement (in %) for optimization with an SC threshold. RNR-l denotes RNR runs with l ∈ [1,3]; BMA-mod is a less conservative modification of the BMA.

