Cryptic failure of partitioned Bayesian phylogenetic analyses: lost in the land of long trees
- PMID: 20525623
- DOI: 10.1093/sysbio/syp080
Abstract
Partitioned Bayesian phylogenetic analyses of routine genetic data sets, constructed using MrBayes (Ronquist and Huelsenbeck 2003), can become trapped in regions of parameter space characterized by unrealistically long trees and distorted partition rate multipliers. Such analyses commonly fail to reach stationarity during hundreds of millions of generations of sampling, many times longer than most published analyses. Some data sets are so prone to this problem that paired MrBayes runs begun from different starting trees repeatedly find the same incorrect long-tree solutions and consequently pass the most commonly employed tests of stationarity, including the average standard deviation of split frequencies (ASDSF) and the potential scale reduction factor (PSRF) statistics offered by MrBayes (Gelman and Rubin 1992). In these situations, failure to reach stationarity is recognizable only in light of prior knowledge of model parameters, such as the expectation that third-codon-position sites usually evolve fastest in protein-coding genes. The conditions that lead to the long-tree problem are frequently encountered in phylogenetic studies today, and I present 6 demonstration examples from the literature. Although the effects on tree length (TL) are often dramatic, effects on topology appear to be subtle. Susceptibility to the problem is sometimes predicted by the difference between the true TL and the starting TL. In some cases, the problems described here can be avoided or reduced by manipulating the starting TL and/or adjusting the prior on branch lengths. In more difficult situations, accurate branch length estimation may not be possible with Bayesian methods because the solution depends on the branch length prior.
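The kind of sanity check the abstract describes, comparing sampled parameters against prior knowledge rather than relying on ASDSF or PSRF alone, can be sketched in a few lines of Python. This is a minimal illustration and not the author's procedure. It assumes an unrooted binary tree with independent exponential branch length priors (so the prior mean TL is the number of branches divided by the rate), and it assumes the TL and rate-multiplier samples have already been read from the sampler output into plain lists. The function names, the partition label `pos3`, the reference TL, and the factor of 10 are all hypothetical choices made for the example.

```python
from statistics import mean


def prior_mean_tree_length(n_taxa, exp_rate):
    """Prior mean TL for an unrooted binary tree whose 2n - 3 branch
    lengths are i.i.d. exponential with the given rate (assumed model)."""
    n_branches = 2 * n_taxa - 3
    return n_branches / exp_rate


def third_position_is_fastest(rate_samples, third_key="pos3"):
    """Check whether the partition labelled third_key has the largest
    mean sampled rate multiplier. rate_samples maps a partition label
    (hypothetical labels) to a list of sampled multipliers."""
    means = {name: mean(values) for name, values in rate_samples.items()}
    fastest = max(means, key=means.get)
    return fastest == third_key, means


def tree_length_is_suspicious(tl_samples, reference_tl, factor=10.0):
    """Flag a run whose mean sampled TL exceeds an external reference TL
    (for example, an independent estimate) by more than an arbitrary factor."""
    return mean(tl_samples) > factor * reference_tl


if __name__ == "__main__":
    # Illustrative numbers only; none are taken from the paper.
    print(prior_mean_tree_length(n_taxa=50, exp_rate=10.0))
    ok, means = third_position_is_fastest(
        {"pos1": [1.9, 2.1], "pos2": [0.8, 0.9], "pos3": [0.3, 0.2]}
    )
    print(ok, means)
    print(tree_length_is_suspicious([45.0, 50.0], reference_tl=1.2))
```

A run that passes the usual convergence statistics but fails either check would merit a closer look at the starting TL and the branch length prior, as the abstract suggests.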