Marginal Likelihoods in Phylogenetics: A Review of Methods and Applications

doi:10.1093/sysbio/syz003

Review

Syst Biol. 2019 Sep 1;68(5):681-697.

Jamie R Oaks¹, Kerry A Cobb¹, Vladimir N Minin², Adam D Leaché³

Affiliations

¹ Department of Biological Sciences and Museum of Natural History, Auburn University, Auburn, AL 36849, USA.
² Department of Statistics, University of California, Irvine, CA 92697, USA.
³ Department of Biology and Burke Museum of Natural History and Culture, University of Washington, Seattle, WA 98195, USA.

PMID: 30668834
PMCID: PMC6701458
DOI: 10.1093/sysbio/syz003

Jamie R Oaks et al. Syst Biol. 2019.

. 2019 Sep 1;68(5):681-697.

doi: 10.1093/sysbio/syz003.

Authors

Jamie R Oaks¹, Kerry A Cobb¹, Vladimir N Minin², Adam D Leaché³

Affiliations

¹ Department of Biological Sciences and Museum of Natural History, Auburn University, Auburn, AL 36849, USA.
² Department of Statistics, University of California, Irvine, CA 92697, USA.
³ Department of Biology and Burke Museum of Natural History and Culture, University of Washington, Seattle, WA 98195, USA.

PMID: 30668834
PMCID: PMC6701458
DOI: 10.1093/sysbio/syz003

Abstract

By providing a framework of accounting for the shared ancestry inherent to all life, phylogenetics is becoming the statistical foundation of biology. The importance of model choice continues to grow as phylogenetic models continue to increase in complexity to better capture micro- and macroevolutionary processes. In a Bayesian framework, the marginal likelihood is how data update our prior beliefs about models, which gives us an intuitive measure of comparing model fit that is grounded in probability theory. Given the rapid increase in the number and complexity of phylogenetic models, methods for approximating marginal likelihoods are increasingly important. Here, we try to provide an intuitive description of marginal likelihoods and why they are important in Bayesian model testing. We also categorize and review methods for estimating marginal likelihoods of phylogenetic models, highlighting several recent methods that provide well-behaved estimates. Furthermore, we review some empirical studies that demonstrate how marginal likelihoods can be used to learn about models of evolution from biological data. We discuss promising alternatives that can complement marginal likelihoods for Bayesian model choice, including posterior-predictive methods. Using simulations, we find one alternative method based on approximate-Bayesian computation to be biased. We conclude by discussing the challenges of Bayesian model choice and future directions that promise to improve the approximation of marginal likelihoods and Bayesian phylogenetics as a whole.

Keywords: Marginal likelihood; model choice; phylogenetics.

Figures

**Figure 1.**
An illustration of the posterior probability densities and marginal likelihoods under the four different prior assumptions we made in our coin-flipping experiment. The data are 50 “heads” out of 100 coin flips, and the parameter, $\theta$, is the probability of the coin landing heads side up. The binomial likelihood density function is proportional to a $\mathrm{Beta}(51, 51)$ density and is the same across the four different beta priors on $\theta$ ($M_1$–$M_4$). With a $\mathrm{Beta}(\alpha, \beta)$ prior, the posterior of each model is a $\mathrm{Beta}(\alpha + 50, \beta + 50)$ distribution. The marginal likelihoods ($p(D \mid M)$; the average of the likelihood density curve weighted by the prior) of the four models are compared.
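The coin-flipping comparison in Figure 1 can be reproduced in closed form, because a beta prior is conjugate to the binomial likelihood. The sketch below is a minimal illustration, not the authors' code: for a Beta(alpha, beta) prior on the heads probability, the marginal likelihood of k heads in n flips is C(n, k) · B(alpha + k, beta + n − k) / B(alpha, beta), where B is the beta function. The four priors in the `priors` dictionary are hypothetical stand-ins chosen for illustration; the caption does not state the priors actually used in the figure.

```python
from math import exp, lgamma


def log_beta_function(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_binomial_coefficient(n, k):
    """Natural log of "n choose k"."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def posterior_parameters(heads, flips, alpha, beta):
    """Parameters of the beta posterior under a Beta(alpha, beta) prior."""
    return alpha + heads, beta + (flips - heads)


def log_marginal_likelihood(heads, flips, alpha, beta):
    """log p(D | M): the binomial likelihood averaged over the beta prior."""
    post_alpha, post_beta = posterior_parameters(heads, flips, alpha, beta)
    return (log_binomial_coefficient(flips, heads)
            + log_beta_function(post_alpha, post_beta)
            - log_beta_function(alpha, beta))


# Data from the caption: 50 heads out of 100 flips.
HEADS, FLIPS = 50, 100

# HYPOTHETICAL priors (assumptions for illustration, not taken from the paper).
priors = {
    "flat": (1.0, 1.0),
    "centered_on_fair": (5.0, 5.0),
    "skewed_to_tails": (1.0, 5.0),
    "u_shaped": (0.6, 0.6),
}

if __name__ == "__main__":
    reference = log_marginal_likelihood(HEADS, FLIPS, *priors["flat"])
    for name, (alpha, beta) in priors.items():
        log_ml = log_marginal_likelihood(HEADS, FLIPS, alpha, beta)
        bayes_factor = exp(log_ml - reference)  # relative to the flat prior
        print(f"{name}: log marginal likelihood = {log_ml:.4f}, "
              f"Bayes factor vs flat = {bayes_factor:.3f}")
```

Under these assumed priors, the prior concentrated near a fair coin receives the highest marginal likelihood, because it places the most prior weight where the likelihood is high; the skewed prior is penalized for spending prior mass on values of the heads probability that the data contradict.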

**Figure 2.**
A comparison of the approximate-likelihood Bayesian computation general linear model (ABC-GLM) estimator of the marginal likelihood (Leuenberger and Wegmann 2010) to quadrature integration approximations (Xie et al. 2011) for 100 simulated data sets. We compared the ratio of marginal likelihoods (Bayes factor) of the correct branch-length model [branch length uniform(0.0001, 0.1)] to a model with a broader prior on the branch length [branch length uniform(0.0001, 0.2)]. The solid line represents perfect performance of the ABC-GLM estimator (i.e., matching the “true” value of the Bayes factor). The dashed line represents the expected Bayes factor when failing to penalize for the extra parameter space (branch length 0.1 to 0.2) with essentially zero likelihood. Quadrature integration with 1000 and 10,000 steps using the rectangular and trapezoidal rules produced identical values of log marginal likelihoods to at least five decimal places for all 100 simulated data sets.
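The quadrature reference values described in the Figure 2 caption can be sketched for a single pair of sequences. The example below is an illustration under stated assumptions, not the authors' implementation: it assumes a Jukes-Cantor (JC69) substitution model, which the caption does not name, and the data summary (10,000 sites, 300 differences) is hypothetical. It integrates the likelihood over the branch length with the trapezoidal rule, working on the log scale to avoid underflow, and divides by the prior width to obtain the marginal likelihood under each uniform prior.

```python
import math


def jc69_log_likelihood(branch_length, n_sites, n_diffs):
    """Log-likelihood of two aligned sequences under JC69 (an assumed model).

    branch_length is in expected substitutions per site; n_diffs is the
    number of sites at which the two sequences differ.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability a site keeps its base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific other base
    return ((n_sites - n_diffs) * math.log(0.25 * p_same)
            + n_diffs * math.log(0.25 * p_diff))


def log_marginal_uniform(lower, upper, n_sites, n_diffs, steps=1000):
    """Log marginal likelihood under a uniform(lower, upper) branch-length prior.

    Uses the trapezoidal rule on likelihoods rescaled by their maximum.
    """
    width = (upper - lower) / steps
    log_liks = [jc69_log_likelihood(lower + i * width, n_sites, n_diffs)
                for i in range(steps + 1)]
    peak = max(log_liks)
    scaled = [math.exp(value - peak) for value in log_liks]
    area = width * (sum(scaled) - 0.5 * (scaled[0] + scaled[-1]))
    # Dividing by the prior width applies the uniform prior density.
    return peak + math.log(area) - math.log(upper - lower)


# HYPOTHETICAL data summary (not from the paper).
N_SITES, N_DIFFS = 10_000, 300

if __name__ == "__main__":
    log_ml_correct = log_marginal_uniform(0.0001, 0.1, N_SITES, N_DIFFS)
    log_ml_broad = log_marginal_uniform(0.0001, 0.2, N_SITES, N_DIFFS)
    log_bayes_factor = log_ml_correct - log_ml_broad
    print(f"log Bayes factor (correct vs. broad prior): {log_bayes_factor:.5f}")
    print(f"ratio of prior widths on log scale: {math.log(0.1999 / 0.0999):.5f}")
```

With data this informative, the likelihood is essentially zero for branch lengths between 0.1 and 0.2, so the Bayes factor approaches the ratio of the prior widths (about 2 in favor of the narrower prior). An estimator that does not penalize the extra prior volume would instead be expected to return a Bayes factor near 1, which appears to be the situation the dashed line in the figure represents.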

**Figure A.1.**
A comparison of the true branch length separating each pair of simulated sequences to the branch length estimated by ABC-GLM and full-likelihood MCMC under the correct branch-length model [branch length uniform(0.0001, 0.1)] and the vague model [branch length uniform(0.0001, 0.2)].

Cited by

Porter AF, Purcell DFJ, Howden BP, Duchene S. Evolutionary rate of SARS-CoV-2 increases during zoonotic infection of farmed mink. Virus Evol. 2023 Jan 10;9(1):vead002. doi: 10.1093/ve/vead002. PMID: 36751428.
Arvindekar S, Pathak AS, Majila K, Viswanath S. Optimizing representations for integrative structural modeling using Bayesian model selection. Bioinformatics. 2024 Mar 4;40(3):btae106. doi: 10.1093/bioinformatics/btae106. PMID: 38391029.
Karcher MD, Carvalho LM, Suchard MA, Dudas G, Minin VN. Estimating effective population size changes from preferentially sampled genetic sequences. PLoS Comput Biol. 2020 Oct 12;16(10):e1007774. doi: 10.1371/journal.pcbi.1007774. PMID: 33044955.
Oaks JR, Siler CD, Brown RM. The comparative biogeography of Philippine geckos challenges predictions from a paradigm of climate-driven vicariant diversification across an island archipelago. Evolution. 2019 Jun;73(6):1151-1167. doi: 10.1111/evo.13754. PMID: 31017301.
Abad-Franch F, Monteiro FA, Pavan MG, Patterson JS, Bargues MD, Zuriaga MÁ, Aguilar M, Beard CB, Mas-Coma S, Miles MA. Under pressure: phenotypic divergence and convergence associated with microhabitat adaptations in Triatominae. Parasit Vectors. 2021 Apr 8;14(1):195. doi: 10.1186/s13071-021-04647-z. PMID: 33832518.
