PLoS Comput Biol. 2014 Nov 6;10(11):e1003913.
doi: 10.1371/journal.pcbi.1003913. eCollection 2014 Nov.

Inference of epidemiological dynamics based on simulated phylogenies using birth-death and coalescent models


Veronika Boskova et al. PLoS Comput Biol.

Abstract

Quantifying epidemiological dynamics is crucial for understanding and forecasting the spread of an epidemic. The coalescent and the birth-death model are used interchangeably to infer epidemiological parameters from the genealogical relationships of the pathogen population under study, which in turn are inferred from the pathogen genetic sequencing data. To compare the performance of these widely applied models, we performed a simulation study. We simulated phylogenetic trees under the constant rate birth-death model and the coalescent model with a deterministic exponentially growing infected population. For each tree, we re-estimated the epidemiological parameters using both a birth-death and a coalescent based method, implemented as an MCMC procedure in BEAST v2.0. In our analyses that estimate the growth rate of an epidemic based on simulated birth-death trees, the point estimates, such as the maximum a posteriori and maximum likelihood estimates, differ little between the two methods. However, the estimates of uncertainty differ substantially. The birth-death model had a higher coverage than the coalescent model, i.e. its highest posterior density (HPD) interval contained the true value more often (error rates of 2-13% vs. 31-75%). The coverage of the coalescent decreases with decreasing basic reproductive ratio and increasing sampling probability of infected individuals. We hypothesize that the biases in the coalescent are due to the assumption of deterministic rather than stochastic population size changes. Both methods performed reasonably well when analyzing trees simulated under the coalescent. The methods can also identify other key epidemiological parameters as long as one of the parameters is fixed to its true value.
In summary, when using genetic data to estimate epidemic dynamics, our results suggest that the birth-death method will be less sensitive to population fluctuations of early outbreaks than the coalescent method that assumes a deterministic exponentially growing infected population.
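
The coverage comparison described above can be illustrated with a minimal sketch. It assumes that posterior samples of the growth rate are already available for each simulated tree (for example, exported from an MCMC log). The function names `hpd_interval` and `coverage` are hypothetical and are not part of BEAST or of the study's code. The HPD interval is approximated here as the shortest interval containing the requested fraction of samples, which is a common estimator for unimodal posteriors and not necessarily the one used by the authors.

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples.

    Assumption: the posterior is unimodal, so the shortest such
    interval approximates the highest posterior density interval.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))  # number of samples the interval must span
    if k < 1 or k > n:
        raise ValueError("mass must lie in (0, 1] and samples must be non-empty")
    # Width of every candidate interval that spans k consecutive sorted samples.
    widths = x[k - 1:] - x[: n - k + 1]
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]

def coverage(posteriors, true_value, mass=0.95):
    """Fraction of posteriors whose HPD interval contains `true_value`."""
    hits = []
    for samples in posteriors:
        lo, hi = hpd_interval(samples, mass)
        hits.append(lo <= true_value <= hi)
    return float(np.mean(hits))

if __name__ == "__main__":
    # Toy posteriors for illustration only; these are not results from the study.
    rng = np.random.default_rng(0)
    toy_posteriors = [rng.normal(loc=0.5, scale=0.1, size=1000) for _ in range(100)]
    print("coverage:", coverage(toy_posteriors, true_value=0.5))
```

The error rate quoted in the abstract corresponds to one minus this coverage, expressed as a percentage.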


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Comparison of the birth-death model and the coalescent model in estimating epidemic growth rate.
For each plot, 100 trees simulated under the constant rate birth-death (BD) model with incomplete sampling (subfigure A) or the coalescent (CE) model with exponential growth of the infected population (subfigure B) were analyzed assuming a birth-death model (blue bars) or a coalescent model with deterministic exponential population growth (red bars). The 95% highest posterior density (HPD) intervals of the growth rate parameter are shown (y-axis). The trees are ordered (x-axis) by the median value of the posterior distribution of the growth rate parameter estimated by the coalescent (orange dot within the red bar) from the birth-death trees. The median of the posterior estimates for the growth rate parameter estimated by the birth-death model is indicated as a light blue dot within each blue interval. The true value of the growth rate parameter, i.e. the value under which the trees were simulated, is displayed as a black horizontal bar. Here, we used formula image and formula image (formula image). See Figure S1 for the plots of other parameter settings.
Figure 2. Influence of branch length extension in various parts of the tree on the growth rate parameter estimation.
For setting formula image and formula image (formula image), we modified all 100 birth-death trees (A) and all 100 coalescent trees (B) by branch extension. The unchanged tree is denoted as “orig” on the x-axis. We added 48 units of time, roughly corresponding to the full length of the longest trees, to the branches. We extended the branches that were present in the tree at 10% of the tree height (going from the root), and likewise at 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90% (see x-axis from left to right). We then re-estimated the growth rate parameter for each such tree. Unlike in the previous plot, here we display a summary in the form of the median values of the start and the end of the 95% HPD intervals, and the median of the medians of the posterior estimates for all 100 trees per setting.
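
The per-setting summary used in this and the following figures (median of the HPD interval starts, median of the HPD interval ends, and median of the posterior medians) can be sketched as follows. This is an illustrative sketch only: the function name `summarize_replicates` and the dictionary keys are hypothetical, and the inputs are assumed to be one value per simulated tree, computed beforehand.

```python
import numpy as np

def summarize_replicates(hpd_lower, hpd_upper, posterior_medians):
    """Collapse per-tree interval summaries into one summary per setting.

    Each argument holds one value per simulated tree (hypothetical inputs):
    the start of the 95% HPD interval, its end, and the posterior median.
    """
    return {
        "median_hpd_lower": float(np.median(hpd_lower)),
        "median_hpd_upper": float(np.median(hpd_upper)),
        "median_of_medians": float(np.median(posterior_medians)),
    }

if __name__ == "__main__":
    # Made-up numbers for three trees, for illustration only.
    print(summarize_replicates([0.1, 0.2, 0.3], [0.5, 0.6, 0.7], [0.3, 0.4, 0.5]))
```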
Figure 3. Influence of sampling scheme on the growth rate parameter estimation.
For setting formula image (formula image), we modified the birth-death tree simulations to include periods of higher (formula image) and lower sampling (either formula image, subfigures A and B, or formula image, subfigures C and D). We simulated 100 birth-death trees (A and C) and corresponding coalescent trees (B and D) under various sampling schemes (see x-axis annotation). We display a summary in the form of the median values of the start and the end of the 95% HPD intervals, and the median of the medians of the posterior estimates for all 100 trees per setting. For the settings where the constant rate birth-death method produced very severe biases, we also analysed the trees with the birth-death skyline model with 10 intervals for the sampling probability (BDSKY, light-blue lines). The summary for trees simulated under constant sampling formula image throughout is shown at the far left of each figure (formula image on the x-axis). Next, we varied the sampling so as to, for example, sample no tips (formula image) in the early phases (formula image until formula image) going forward in time and then sample all the tips that die (formula image) from formula image onward (corresponding to the setting denoted as “p = 0 from t = 0 to t = 9”).
Figure 4. Error on formula image as a function of sampling probability for fixed formula image and formula image.
In (A) the relationship between the error on formula image, i.e. estimated formula image/true formula image, and the sampling probability formula image is plotted. The values formula image and formula image are fixed. For different formula image, formula image, and formula image and formula image, we calculate formula image and formula image, and plot the impact on formula image error when changing formula image during inference using Equation (3) in the Supplementary Material S1. In (B) we display how error on formula image depends on different assumptions of formula image during inference for formula image, and formula image and an array of true sampling probability formula image used for calculating formula image and formula image.
Figure 5. Effect of different information used in the parameter inference.
For setting formula image and formula image (formula image), we estimated the formula image parameter from the birth-death trees (A) and the coalescent trees (B) using four methods. First, using the coalescent posterior estimates of the growth rate formula image and the true formula image, we obtained formula image formula image formula image formula image (red bars). Second, we used the birth-death posterior estimates of formula image (trees analysed under uniform priors for formula image, formula image, and formula image), and the true formula image in the post-processing (blue bars), similar to the procedure used for the coalescent. Third and fourth, we analyzed the trees by fixing the prior on the death rate formula image to the true value, formula image (green bars), or by fixing the prior on the sampling probability formula image to the true value, formula image (purple bars), during the MCMC analysis. Note that the y-axis now displays the 95% HPD of the formula image parameter, and within each figure, the trees (simulations) are ordered (x-axis) by the median estimate of the growth rate parameter formula image estimated by the coalescent on the birth-death trees.
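
The post-processing step, in which growth rate estimates are combined with a known death rate, can be sketched under a loudly stated assumption. The formulas in this caption were lost in extraction, so the relation used below is not taken from the source: it assumes the standard birth-death parameterization in which the growth rate is r = λ - δ and the basic reproductive ratio is R0 = λ/δ, which gives R0 = r/δ + 1. The function name `r0_from_growth_rate` and the symbols r, λ and δ are hypothetical labels chosen for this illustration.

```python
import numpy as np

def r0_from_growth_rate(growth_rate_samples, death_rate):
    """Map posterior samples of the growth rate r to samples of R0.

    Assumption (not confirmed by the source text): r = lambda - delta and
    R0 = lambda / delta, so that R0 = r / delta + 1.
    """
    if death_rate <= 0:
        raise ValueError("death_rate must be positive")
    r = np.asarray(growth_rate_samples, dtype=float)
    return r / death_rate + 1.0

if __name__ == "__main__":
    # Made-up growth rate samples and death rate, for illustration only.
    print(r0_from_growth_rate([0.0, 0.5, 1.0], death_rate=0.5))
```

Because the mapping is monotone in r for a fixed positive death rate, the median and the quantiles of the growth rate posterior carry over directly to the transformed samples.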
Figure 6. Comparison of the birth-death model and the coalescent model in estimating epidemic growth rate from trees with tips sampled at one point in time.
For simulated trees where all 100 tips are sampled at one point in time, we estimated the growth rate parameter assuming a birth-death model with fixed sampling probability formula image (blue bars) and the coalescent model with a deterministic exponentially growing population (red bars). Here we used formula image and sampling probability formula image (formula image). See Figure S15 for the plots of other parameter settings.
Figure 7. Comparison of growth rate point estimates of the birth-death model and the coalescent model.
For setting formula image and formula image (formula image), we display the ML and MAP estimates for the birth-death trees (A) and the coalescent trees (B). As a comparison, the median values of the start and the end of the 95% HPD intervals, and the median of the medians of the posterior estimates for all 100 trees per setting are also displayed. The true value of the growth rate parameter, i.e. the value under which the trees were simulated, is displayed as a black horizontal bar. See Figures S17 and S18 for the plots of other parameter settings.

