PLoS Comput Biol. 2011 Aug;7(8):e1002136.
doi: 10.1371/journal.pcbi.1002136. Epub 2011 Aug 25.

Inference for nonlinear epidemiological models using genealogies and time series

David A Rasmussen et al. PLoS Comput Biol. 2011 Aug.

Abstract

Phylodynamics - the field aiming to quantitatively integrate the ecological and evolutionary dynamics of rapidly evolving populations like those of RNA viruses - increasingly relies upon coalescent approaches to infer past population dynamics from reconstructed genealogies. As sequence data have become more abundant, these approaches are beginning to be used on populations undergoing rapid and rather complex dynamics. In such cases, the simple demographic models that current phylodynamic methods employ can be limiting. First, these models are not ideal for yielding biological insight into the processes that drive the dynamics of the populations of interest. Second, these models differ in form from mechanistic and often stochastic population dynamic models that are currently widely used when fitting models to time series data. As such, their use does not allow for both genealogical data and time series data to be considered in tandem when conducting inference. Here, we present a flexible statistical framework for phylodynamic inference that goes beyond these current limitations. The framework we present employs a recently developed method known as particle MCMC to fit stochastic, nonlinear mechanistic models for complex population dynamics to gene genealogies and time series data in a Bayesian framework. We demonstrate our approach using a nonlinear Susceptible-Infected-Recovered (SIR) model for the transmission dynamics of an infectious disease and show through simulations that it provides accurate estimates of past disease dynamics and key epidemiological parameters from genealogies with or without accompanying time series data.
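The particle MCMC idea named in the abstract can be illustrated on a much simpler problem. The sketch below is not the authors' SIR or coalescent implementation: it assumes a toy random-walk state model with Gaussian observations, a flat prior on the noise scale, and hypothetical function names (`bootstrap_loglik`, `pmmh`). It shows only the general pattern: a particle filter gives a noisy estimate of the marginal likelihood, and a Metropolis-Hastings sampler uses that estimate in place of the exact likelihood.

```python
import numpy as np

def bootstrap_loglik(obs, sigma, n_particles, rng):
    """Bootstrap particle filter estimate of log p(obs | sigma) for a toy model
    (an assumption for illustration, not the model from the paper):
        x_t = x_{t-1} + Normal(0, sigma^2),   y_t = x_t + Normal(0, 1)
    """
    particles = np.zeros(n_particles)            # every particle starts at x_0 = 0
    loglik = 0.0
    for y in obs:
        # Propagate each particle through the stochastic state model.
        particles = particles + rng.normal(0.0, sigma, n_particles)
        # Log-weight each particle by the observation density.
        logw = -0.5 * (y - particles) ** 2 - 0.5 * np.log(2.0 * np.pi)
        m = logw.max()
        w = np.exp(logw - m)                     # subtract the max for stability
        loglik += m + np.log(w.mean())           # running marginal likelihood
        # Multinomial resampling in proportion to the weights.
        particles = rng.choice(particles, size=n_particles, p=w / w.sum())
    return loglik

def pmmh(obs, n_iter, n_particles, rng, sigma0=1.0, step_sd=0.2):
    """Particle marginal Metropolis-Hastings for sigma, flat prior on sigma > 0."""
    sigma = sigma0
    ll = bootstrap_loglik(obs, sigma, n_particles, rng)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = sigma + rng.normal(0.0, step_sd)  # symmetric random-walk proposal
        if prop > 0:                             # proposals outside the prior are rejected
            ll_prop = bootstrap_loglik(obs, prop, n_particles, rng)
            # The stored estimate for the current state is reused, not recomputed.
            if np.log(rng.uniform()) < ll_prop - ll:
                sigma, ll = prop, ll_prop
        chain[i] = sigma
    return chain

rng = np.random.default_rng(0)
true_x = np.cumsum(rng.normal(0.0, 0.5, 50))     # hidden random walk
obs = true_x + rng.normal(0.0, 1.0, 50)          # noisy observations of it
chain = pmmh(obs, n_iter=200, n_particles=200, rng=rng)
```

In the framework the abstract describes, the state model would be the stochastic SIR process and the weights would come from the genealogy and, when available, the time series; those components are not reproduced here.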

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Simulated infection dynamics and time series used to test the particle MCMC algorithm.
(A) Disease dynamics (I) obtained by simulating from the SIR process model (equations 10) over a 4-year period. (B) Corresponding time series of monthly incidence reports simulated from the observation model (equation 11). Parameters used in the simulation of the process model were: recovery rate = 3/month, basic reproduction number = 10, strength of seasonality = 0.16, and F = 0.012. Other process model parameters that were assumed to be known were: [formula image] = 0.0017/month, and N = 5 million. Parameters used in the simulation of the time series data were: reporting rate = 0.43, and [formula image] = 15.
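A simulation of this kind can be sketched as follows. Equations 10 and 11 are not reproduced in this record, so the code substitutes a generic seasonally forced SIR model and should be read as an illustration only. Everything beyond the numeric values in the caption is an assumption: which symbol each value belongs to, the cosine forcing, the relation between transmission rate and basic reproduction number, the multiplicative Gaussian noise scaled by F, the starting state, and the Gaussian reporting noise. The function name `simulate_sir` is hypothetical.

```python
import numpy as np

def simulate_sir(months=48, steps_per_month=30, seed=1):
    """Euler-type simulation of a noisy, seasonally forced SIR model (illustrative)."""
    # Numeric values from the Figure 1 caption; their roles are assumed.
    gamma, r0, alpha, F = 3.0, 10.0, 0.16, 0.012  # recovery/month, R0, seasonality, noise
    mu, N = 0.0017, 5_000_000                     # assumed host turnover/month, population
    rho, tau = 0.43, 15.0                         # reporting rate, assumed noise std. dev.

    rng = np.random.default_rng(seed)
    dt = 1.0 / steps_per_month
    beta0 = r0 * (gamma + mu)                     # assumed: mean transmission rate from R0
    S = N / r0                                    # assumed start at the endemic equilibrium
    I = mu * N * (1.0 - 1.0 / r0) / (gamma + mu)

    prevalence = np.empty(months)
    reports = np.empty(months)
    for m in range(months):
        new_cases = 0.0
        for k in range(steps_per_month):
            t = m + k * dt                        # time in months
            beta = beta0 * (1.0 + alpha * np.cos(2.0 * np.pi * t / 12.0))
            noise = 1.0 + F * rng.normal() / np.sqrt(dt)   # assumed environmental noise
            infections = max(beta * S * I / N * noise, 0.0) * dt
            infections = min(infections, S)       # cannot infect more hosts than exist
            S += mu * N * dt - infections - mu * S * dt
            I += infections - (gamma + mu) * I * dt
            I = max(I, 0.0)
            new_cases += infections
        prevalence[m] = I                         # prevalence at the end of the month
        # Assumed observation model: a reported fraction of incidence plus Gaussian noise.
        reports[m] = max(rng.normal(rho * new_cases, tau), 0.0)
    return prevalence, reports

prevalence, reports = simulate_sir()
```

The two returned arrays play the roles of panels (A) and (B): monthly prevalence and monthly incidence reports over four years.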
Figure 2. Posterior densities of estimated model parameters.
Frequency histograms representing the marginal posterior densities of the SIR model parameters obtained using the particle MCMC algorithm. Vertical blue lines are placed at the true values of the parameters, solid red lines are the median values of the posterior densities and dashed red lines mark the 95% Bayesian credible intervals. From left to right, the parameters are the recovery rate, the basic reproduction number, the strength of seasonality, the parameter scaling the strength of environmental noise F, the reporting rate, and the observation variance. (A–F) Parameters inferred using time series data. (G–J) Parameters inferred using a genealogy. The reporting rate and the observation variance cannot be inferred using only a genealogy because they are parameters associated with the time series observation model. (K–P) Parameters inferred using both a genealogy and time series.
Figure 3. Posterior densities for disease prevalence over time.
Series of posterior densities for disease prevalence I over time obtained using particle MCMC. Blue lines represent the exact simulated prevalence, black lines are the median of the posterior density and dashed red lines represent the 95% credible intervals. (A) Prevalence inferred from time series data. (B) Prevalence inferred from a genealogy. (C) Prevalence inferred from both a genealogy and time series.
Figure 4. Posterior densities of parameters under epidemic conditions.
Posterior densities of the recovery rate and the basic reproduction number estimated from 100 independent genealogies obtained from simulated epidemic dynamics. (A–B) Frequency histograms representing the marginal posterior densities of the recovery rate and the basic reproduction number obtained from a single representative simulation. (C) The distribution of the median values of the posterior densities of the two parameters in parameter space for all 100 simulations (open red circles). The solid blue circle marks the true values of the parameters. Note that in our model formulation, the recovery rate and the basic reproduction number are independent parameters, with the transmission rate computed as [formula image].
Figure 5. Simulated genealogy used to test the particle MCMC algorithm.
Genealogy obtained from the simulated disease dynamics shown in Figure 1A. The genealogy contains 200 terminal nodes corresponding to sequence samples collected sequentially over time, with yearly sample sizes of approximately 50 sequences. Sampling events were chosen to occur at random times over the entire interval of the time series.
