Mol Biol Evol. 2015 Sep;32(9):2483-95.
doi: 10.1093/molbev/msv123. Epub 2015 May 25.

Phylodynamic Inference with Kernel ABC and Its Application to HIV Epidemiology

Art F Y Poon. Mol Biol Evol. 2015 Sep.

Abstract

The shapes of phylogenetic trees relating virus populations are determined by the adaptation of viruses within each host, and by the transmission of viruses among hosts. Phylodynamic inference attempts to reverse this flow of information, estimating parameters of these processes from the shape of a virus phylogeny reconstructed from a sample of genetic sequences from the epidemic. A key challenge to phylodynamic inference is quantifying the similarity between two trees in an efficient and comprehensive way. In this study, I demonstrate that a new distance measure, based on a subset tree kernel function from computational linguistics, confers a significant improvement over previous measures of tree shape for classifying trees generated under different epidemiological scenarios. Next, I incorporate this kernel-based distance measure into an approximate Bayesian computation (ABC) framework for phylodynamic inference. ABC bypasses the need for an analytical solution of model likelihood, as it only requires the ability to simulate data from the model. I validate this "kernel-ABC" method for phylodynamic inference by estimating parameters from data simulated under a simple epidemiological model. Results indicate that kernel-ABC attained greater accuracy for parameters associated with virus transmission than leading software on the same data sets. Finally, I apply the kernel-ABC framework to study a recent outbreak of a recombinant HIV subtype in China. Kernel-ABC provides a versatile framework for phylodynamic inference because it can fit a broader range of models than methods that rely on the computation of exact likelihoods.

Keywords: approximate Bayesian computation; human immunodeficiency virus; molecular epidemiology; phylodynamics; tree shape; virus evolution.
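
The ABC idea described in the abstract (fit a model using only the ability to simulate from it, with a kernel-derived distance deciding which simulations resemble the data) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of the paper: it uses plain rejection sampling, a toy exponential simulator in place of an epidemic model, and a linear kernel on histogram counts in place of the subset tree kernel on phylogenies. The form of the distance (one minus a normalized kernel score) and all function names are assumptions made for this example.

```python
import math
import random

def simulate(rate, n, rng):
    """Toy simulator (assumption): n exponential waiting times."""
    return [rng.expovariate(rate) for _ in range(n)]

def features(sample, bins=10, width=0.5):
    """Histogram counts; the last bin collects all larger values."""
    counts = [0] * bins
    for value in sample:
        counts[min(int(value / width), bins - 1)] += 1
    return counts

def kernel(a, b):
    """Linear kernel on feature vectors, a stand-in for a tree kernel."""
    return sum(x * y for x, y in zip(a, b))

def kernel_distance(a, b):
    """One minus the normalized kernel score (assumed form)."""
    return 1.0 - kernel(a, b) / math.sqrt(kernel(a, a) * kernel(b, b))

def abc_rejection(observed, prior_draw, n_draws, tolerance, rng):
    """Keep prior draws whose simulated data lie within tolerance."""
    obs = features(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        sim = features(simulate(theta, len(observed), rng))
        if kernel_distance(obs, sim) <= tolerance:
            accepted.append(theta)
    return accepted

def uniform_prior(rng):
    """Hypothetical prior on the rate parameter."""
    return rng.uniform(0.1, 5.0)

rng = random.Random(1)
observed = simulate(2.0, 200, rng)  # pretend this is the real data
posterior = abc_rejection(observed, uniform_prior, 2000, 0.02, rng)
```

The accepted values in `posterior` approximate the posterior distribution of the rate. No likelihood is ever evaluated, which is why the same scheme extends to models whose likelihood is intractable.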

Figures

Fig. 1.
(A) Response of Sackin’s index (I_S) to varying heterogeneity in contact rates. Each set of box-and-whisker plots summarizes the distribution of I_S for 100 replicate trees simulated under different values of c1 (c1 = c2 = 1 is shaded for reference). Varying c1 away from c2 = 1 had a more pronounced effect under preferential (ρ = 0.9) than proportional (ρ = 0) mixing. (B) Projection of simulated trees to kernel space. Sets of 100 replicate trees are each annotated in the plot with their corresponding c1 values. The proportion of variation explained by the first two principal components, as estimated from the eigenvalues, is reported by the respective axis labels.
Fig. 2.
Comparison of relative errors for posterior estimates of BDSIR model parameters obtained by kernel-ABC and BEAST2. Each box-and-whisker plot summarizes the parameter distribution sampled from the posterior density. The relative error was calculated as $(\hat{x} - x)/x$, where $\hat{x}$ is the estimated value and $x$ is the true value; hence, a relative error of −0.9 corresponds to a 10-fold underestimate. These quantities were offset by 1 for rendering on a log-transformed scale. The vertical span of boxes corresponds to the interquartile range and the whiskers extend to the empirical 95% credible intervals. Population size corresponds to the total number of susceptible, infected and removed individuals. Detailed results are provided in supplementary table S1, Supplementary Material online.
Fig. 3.
Histograms summarizing the distribution of c1 (contact rate, group 1) estimated by a kernel-ABC analysis on trees simulated with c1=0.5 (blue) and c1=2.0 (red, hatched). The true parameter settings and median estimates are indicated by solid and dashed white vertical lines, respectively. Both runs were initialized at c1=1. The simulated annealing tolerance parameter was set to decay from 0.005 to 0.002.
Fig. 4.
Reconstructed dynamics of a recent outbreak of HIV CRF07_BC in China. Numbers of susceptible (gray) and infected (red) individuals over time were obtained by forward-time simulation in MASTER under the birth–death SIR model, using parameter values sampled by kernel-ABC. Shaded areas indicate the interquartile regions of the respective counts. Note that the infected counts include removed individuals from simulated populations. A solid vertical line indicates the median tree height that approximates the date of the most recent sample in the data set (2010). Dashed lines indicate the interquartile range in tree heights. Simulation time zero corresponds to the estimated origin of the epidemic (1993–1994).
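
Two quantities named in the captions above can be made concrete with a short sketch. The relative error follows the formula in the Fig. 2 caption. Sackin's index is computed with its standard definition (the sum, over all tips, of the number of branches between the tip and the root), which is not spelled out in the captions. The nested-tuple tree encoding and the function names are hypothetical choices for this example.

```python
import math

def relative_error(estimate, truth):
    """(estimate - truth) / truth, as in the Fig. 2 caption."""
    return (estimate - truth) / truth

def sackin_index(tree, depth=0):
    """Sum of tip depths. Internal nodes are tuples of children;
    any other object is treated as a tip (hypothetical encoding)."""
    if not isinstance(tree, tuple):
        return depth
    return sum(sackin_index(child, depth + 1) for child in tree)

# A 10-fold underestimate gives a relative error of -0.9.
err = relative_error(0.1, 1.0)

# Offsetting by 1 yields estimate/truth, which is positive
# and can therefore be drawn on a log scale.
log_ratio = math.log10(1.0 + err)

# Four-tip trees: a balanced shape and a ladder-like shape.
balanced = (("a", "b"), ("c", "d"))
ladder = ((("a", "b"), "c"), "d")
```

For these two trees, `sackin_index` returns 8 for the balanced shape and 9 for the ladder, so larger values indicate a less balanced tree.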
