Bayesian evolutionary model testing in the phylogenomics era: matching model complexity with computational efficiency
- PMID: 23766415
- DOI: 10.1093/bioinformatics/btt340
Abstract
Motivation: The advent of new sequencing technologies has led to increasing amounts of data being available to perform phylogenetic analyses, with genomic data giving rise to the field of phylogenomics. High-performance computing is becoming an indispensable research tool to fit complex evolutionary models, which take into account specific genomic properties, to large datasets. Here, we perform an extensive Bayesian phylogenetic model selection study, comparing codon and nucleotide substitution models, including codon position partitioning for nucleotide data as well as gene-specific substitution models for both data types. For the best fitting partitioned models, we also compare independent partitioning with standard diffuse prior specification to conditional partitioning via hierarchical prior specification. To compare the different models, we use state-of-the-art marginal likelihood estimation techniques, including path sampling and stepping-stone sampling.
Results: We show that a full codon model best describes the features of a whole mitochondrial genome dataset, consisting of 12 protein-coding genes, but only when each gene is allowed to evolve under a separate codon model. However, when using hierarchical prior specification for the partition-specific parameters instead of independent diffuse priors, codon position partitioned nucleotide models can still outperform standard codon models. We demonstrate the feasibility of fitting such a combination of complex models using the BEAGLE library for BEAST in combination with recent graphics cards. We argue that the development and use of such models need to be accompanied by state-of-the-art marginal likelihood estimators because the more traditional and computationally less demanding estimators do not offer adequate accuracy.
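For readers unfamiliar with stepping-stone sampling, the idea can be illustrated on a toy problem. The sketch below is not the BEAST implementation and does not involve phylogenetic models; it assumes a Gaussian likelihood with known standard deviation and a Gaussian prior on the mean, so that each power posterior can be sampled exactly (standing in for MCMC) and the true log marginal likelihood is available in closed form for comparison. All function names, the toy data, and the settings (32 stones, 2000 draws, a Beta(0.3, 1) quantile schedule for the powers) are illustrative assumptions.

```python
import math
import random


def log_likelihood(mu, data, sigma):
    """Log-likelihood of i.i.d. N(mu, sigma^2) observations."""
    n = len(data)
    ss = sum((y - mu) ** 2 for y in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - 0.5 * ss / sigma ** 2


def exact_log_marginal(data, sigma, m0, s0):
    """Closed-form log marginal likelihood under the prior mu ~ N(m0, s0^2)."""
    n = len(data)
    d = [y - m0 for y in data]
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    log_det = 2 * n * math.log(sigma) + math.log(1 + n * s0 ** 2 / sigma ** 2)
    quad = (sum_d2 - s0 ** 2 * sum_d ** 2 / (sigma ** 2 + n * s0 ** 2)) / sigma ** 2
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * log_det - 0.5 * quad


def sample_power_posterior(beta, data, sigma, m0, s0, rng, size):
    """Exact draws from the density proportional to likelihood^beta * prior.

    In a real analysis these draws would come from an MCMC run at power beta.
    """
    precision = 1 / s0 ** 2 + beta * len(data) / sigma ** 2
    mean = (m0 / s0 ** 2 + beta * sum(data) / sigma ** 2) / precision
    sd = math.sqrt(1 / precision)
    return [rng.gauss(mean, sd) for _ in range(size)]


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def stepping_stone_log_marginal(data, sigma, m0, s0, n_stones=32, draws=2000, seed=1):
    """Sum the log ratios of normalizing constants between adjacent powers."""
    rng = random.Random(seed)
    # Powers placed at evenly spaced quantiles of Beta(0.3, 1), which
    # concentrates stones near the prior (beta = 0).
    betas = [(k / n_stones) ** (1 / 0.3) for k in range(n_stones + 1)]
    total = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        mus = sample_power_posterior(b_prev, data, sigma, m0, s0, rng, draws)
        weights = [(b_next - b_prev) * log_likelihood(mu, data, sigma) for mu in mus]
        total += log_mean_exp(weights)
    return total


def make_toy_data(n=20, seed=0):
    """Hypothetical data set: n draws from N(1, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(1.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    data = make_toy_data()
    print("stepping-stone:", stepping_stone_log_marginal(data, 1.0, 0.0, 3.0))
    print("exact:         ", exact_log_marginal(data, 1.0, 0.0, 3.0))
```

Each stone estimates the ratio of normalizing constants between two neighbouring power posteriors, so no single importance-sampling step has to bridge the full gap between prior and posterior. Path sampling uses the same sequence of power posteriors but integrates the expected log-likelihood over the power instead of multiplying ratios.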