Inference for nonlinear epidemiological models using genealogies and time series
- PMID: 21901082
- PMCID: PMC3161897
- DOI: 10.1371/journal.pcbi.1002136
Abstract
Phylodynamics - the field aiming to quantitatively integrate the ecological and evolutionary dynamics of rapidly evolving populations like those of RNA viruses - increasingly relies upon coalescent approaches to infer past population dynamics from reconstructed genealogies. As sequence data have become more abundant, these approaches are beginning to be used on populations undergoing rapid and rather complex dynamics. In such cases, the simple demographic models that current phylodynamic methods employ can be limiting. First, these models are not ideal for yielding biological insight into the processes that drive the dynamics of the populations of interest. Second, these models differ in form from mechanistic and often stochastic population dynamic models that are currently widely used when fitting models to time series data. As such, their use does not allow for both genealogical data and time series data to be considered in tandem when conducting inference. Here, we present a flexible statistical framework for phylodynamic inference that goes beyond these current limitations. The framework we present employs a recently developed method known as particle MCMC to fit stochastic, nonlinear mechanistic models for complex population dynamics to gene genealogies and time series data in a Bayesian framework. We demonstrate our approach using a nonlinear Susceptible-Infected-Recovered (SIR) model for the transmission dynamics of an infectious disease and show through simulations that it provides accurate estimates of past disease dynamics and key epidemiological parameters from genealogies with or without accompanying time series data.
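The core computation behind particle MCMC can be sketched in a few lines. The block below is a minimal illustration, not the authors' implementation: it uses a bootstrap particle filter to estimate the likelihood of a case time series under a simple stochastic SIR model. It omits the seasonality, environmental noise, and coalescent (genealogy) likelihood described in the abstract. The Gaussian observation model and all names and values (`sir_step`, `particle_filter_loglik`, `beta`, `gamma`, `rho`, `sd`, the population size) are assumptions chosen for illustration.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sir_step(rng, state, beta, gamma, pop_size):
    """Advance (S, I) by one time step; return the new state and new infections."""
    s, i = state
    p_inf = 1.0 - math.exp(-beta * i / pop_size)  # per-susceptible infection probability
    p_rec = 1.0 - math.exp(-gamma)                # per-infected recovery probability
    new_inf = binomial(rng, s, p_inf)
    new_rec = binomial(rng, i, p_rec)
    return (s - new_inf, i + new_inf - new_rec), new_inf


def log_normal_pdf(x, mean, sd):
    """Log density of Normal(mean, sd^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def simulate_cases(rng, n_steps, beta, gamma, rho, sd, pop_size, i0):
    """Simulate reported cases: a fraction rho of new infections plus Gaussian noise."""
    state, cases = (pop_size - i0, i0), []
    for _ in range(n_steps):
        state, new_inf = sir_step(rng, state, beta, gamma, pop_size)
        cases.append(rng.gauss(rho * new_inf, sd))
    return cases


def particle_filter_loglik(rng, cases, beta, gamma, rho, sd, pop_size, i0,
                           n_particles=50):
    """Bootstrap particle filter estimate of log p(cases | parameters)."""
    particles = [(pop_size - i0, i0)] * n_particles
    loglik = 0.0
    for y in cases:
        proposed, logw = [], []
        for state in particles:
            # Propagate each particle through the stochastic process model.
            new_state, new_inf = sir_step(rng, state, beta, gamma, pop_size)
            proposed.append(new_state)
            # Weight the particle by how well it explains the observation.
            logw.append(log_normal_pdf(y, rho * new_inf, sd))
        max_lw = max(logw)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - max_lw) for lw in logw]
        loglik += max_lw + math.log(sum(weights) / n_particles)
        # Resample particles in proportion to their weights.
        particles = rng.choices(proposed, weights=weights, k=n_particles)
    return loglik


# Hypothetical usage: simulate data, then estimate the log-likelihood
# at the parameter values that generated it.
rng = random.Random(1)
data = simulate_cases(rng, 20, beta=1.5, gamma=0.5, rho=0.5, sd=2.0,
                      pop_size=200, i0=5)
ll = particle_filter_loglik(rng, data, beta=1.5, gamma=0.5, rho=0.5, sd=2.0,
                            pop_size=200, i0=5)
print(ll)
```

In a particle MCMC scheme of the particle marginal Metropolis-Hastings type, a noisy likelihood estimate of this kind takes the place of the exact likelihood in the acceptance ratio. For a genealogy, the Gaussian observation weight would be replaced by a coalescent likelihood evaluated on each particle's simulated trajectory.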
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
[Figure captions, partially recovered; parameter symbols lost in extraction are marked [symbol].]
Parameter values: [symbol] = 3/month, [symbol] = 10, [symbol] = 0.16, and F = 0.012. Other process model parameters that were assumed to be known were: [symbol] = 0.0017/month and N = 5 million. Parameters used in the simulation of the time series data were: [symbol] = 0.43 and [symbol] = 15.
Parameters shown: [symbol], the basic reproduction number [symbol], the strength of seasonality [symbol], the parameter scaling the strength of environmental noise F, the reporting rate [symbol], and the observation variance [symbol]. (A–F) Parameters inferred using time series data. (G–J) Parameters inferred using a genealogy. Parameters [symbol] and [symbol] cannot be inferred using only a genealogy because they are parameters associated with the time series observation model. (K–P) Parameters inferred using both a genealogy and time series.
[symbol] and [symbol] estimated from 100 independent genealogies obtained from simulated epidemic dynamics. (A–B) Frequency histograms representing the marginal posterior densities of [symbol] and [symbol] obtained from a single representative simulation. (C) The distribution of the median values of the posterior densities of [symbol] and [symbol] in parameter space for all 100 simulations (open red circles). The solid blue circle marks the true values of the parameters. Note that in our model formulation, [symbol] and [symbol] are independent parameters, with the transmission rate computed as [expression].
