Review
2022 Oct 10;377(1861):20210242.
doi: 10.1098/rstb.2021.0242. Epub 2022 Aug 22.

Scalable Bayesian phylogenetics

Alexander A Fisher et al. Philos Trans R Soc Lond B Biol Sci.

Abstract

Recent advances in Bayesian phylogenetics offer substantial computational savings to accommodate increased genomic sampling that challenges traditional inference methods. In this review, we begin with a brief summary of the Bayesian phylogenetic framework, and then conceptualize a variety of methods to improve posterior approximations via Markov chain Monte Carlo (MCMC) sampling. Specifically, we discuss methods to improve the speed of likelihood calculations, reduce MCMC burn-in, and generate better MCMC proposals. We apply several of these techniques to study the evolution of HIV virulence along a 1536-tip phylogeny and estimate the internal node heights of a 1000-tip SARS-CoV-2 phylogenetic tree in order to illustrate the speed-up of such analyses using current state-of-the-art approaches. We conclude our review with a discussion of promising alternatives to MCMC that approximate the phylogenetic posterior. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.

Keywords: BEAST; Bayesian phylogenetics; Hamiltonian Monte Carlo; adaptive MCMC; online inference; scalable inference.

Figures

Figure 1.
The online addition of 132 SARS-CoV-2 sequences to a 588-tip time-measured tree drawn from the posterior. Appended branches are blue and original branches are black. We omit the timescale because this augmented tree is not sampled from the posterior.
Figure 2.
(a) Joint trajectory of two branch-specific clock rates γ1 and γ2 over their joint density in a three-taxon tree with simulated sequence data. Trajectories display 600 posterior samples from the uMH chain and 100 posterior samples from the HMC chain, since uMH takes six times as many steps when controlling for runtime. The strong posterior correlation between γ1 and γ2 results in very poor mixing under uMH, while HMC accommodates it easily. (b) Effective sample size (ESS) per second of BEAST runtime under both HMC and uMH MCMC samplers of branch-specific rates of phenotypic evolution over a 1536-tip HIV-1 tree. HMC results in a median 1000-fold speed-up. (c) Trace plot of B.1.177 clade age with node heights sampled under both uMH MCMC and HMC.
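
The mixing contrast described in Figure 2(a) can be reproduced qualitatively on a toy problem. The sketch below is not the BEAST implementation and does not use the paper's phylogenetic posterior. It assumes a bivariate Gaussian with correlation 0.99 as a stand-in for the joint density of γ1 and γ2, and it reads uMH as a Metropolis-Hastings sampler that updates one coordinate at a time. The function names, step sizes, and the rule that matches the two samplers on the number of density or gradient evaluations (a crude proxy for runtime) are all illustrative assumptions.

```python
import numpy as np

RHO = 0.99  # assumed correlation of the toy target, not a value from the paper
COV = np.array([[1.0, RHO], [RHO, 1.0]])
PRECISION = np.linalg.inv(COV)


def log_density(x):
    """Unnormalised log density of the zero-mean bivariate Gaussian."""
    return -0.5 * x @ PRECISION @ x


def grad_log_density(x):
    """Gradient of log_density with respect to x."""
    return -PRECISION @ x


def run_umh(n_sweeps, step, rng):
    """Random-walk MH that updates one coordinate at a time (one sample per sweep)."""
    x = np.zeros(2)
    logp = log_density(x)
    out = np.empty((n_sweeps, 2))
    for i in range(n_sweeps):
        for j in range(2):
            prop = x.copy()
            prop[j] += step * rng.normal()
            logp_prop = log_density(prop)
            if np.log(rng.uniform()) < logp_prop - logp:
                x, logp = prop, logp_prop
        out[i] = x
    return out


def run_hmc(n_iter, eps, n_leapfrog, rng):
    """HMC with an identity mass matrix and a leapfrog integrator."""
    x = np.zeros(2)
    out = np.empty((n_iter, 2))
    accepted = 0
    for i in range(n_iter):
        p = rng.normal(size=2)  # fresh momentum each iteration
        x_new, p_new = x.copy(), p.copy()
        p_new += 0.5 * eps * grad_log_density(x_new)  # initial half step
        for s in range(n_leapfrog):
            x_new += eps * p_new
            scale = eps if s < n_leapfrog - 1 else 0.5 * eps  # final half step
            p_new += scale * grad_log_density(x_new)
        h_old = -log_density(x) + 0.5 * p @ p
        h_new = -log_density(x_new) + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
            accepted += 1
        out[i] = x
    return out, accepted / n_iter


def effective_sample_size(chain):
    """ESS of a 1-D chain using Geyer's initial positive sequence truncation."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    xc = x - x.mean()
    spec = np.fft.rfft(xc, n=2 * n)
    acov = np.fft.irfft(spec * np.conj(spec))[:n] / n
    rho = acov / acov[0]
    pairs = rho[: n - n % 2].reshape(-1, 2).sum(axis=1)
    nonpos = np.nonzero(pairs <= 0)[0]
    cutoff = nonpos[0] if nonpos.size else pairs.size
    tau = -1.0 + 2.0 * pairs[:cutoff].sum()  # integrated autocorrelation time
    return n / max(tau, 1e-12)  # floor guards against a degenerate estimate


def compare_samplers(seed=0, n_hmc=1000, eps=0.1, n_leapfrog=22, umh_step=0.3):
    """Run both samplers with roughly equal numbers of density/gradient evaluations."""
    rng = np.random.default_rng(seed)
    n_sweeps = n_hmc * (n_leapfrog + 1) // 2  # two density evaluations per sweep
    umh = run_umh(n_sweeps, umh_step, rng)
    hmc, accept = run_hmc(n_hmc, eps, n_leapfrog, rng)
    return {
        "ess_umh": effective_sample_size(umh[:, 0]),
        "ess_hmc": effective_sample_size(hmc[:, 0]),
        "hmc_accept": accept,
    }


if __name__ == "__main__":
    print(compare_samplers(seed=1))
```

Under these assumptions the coordinate-wise sampler can only take steps on the scale of the narrow conditional distribution, so it diffuses slowly along the ridge of the correlated density, whereas each HMC trajectory travels along the ridge using gradient information. The ESS values printed by `compare_samplers` are specific to this toy target and say nothing about the magnitudes reported in Figure 2(b).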
