Mol Biol Evol. 2014 Jul;31(7):1869-79.
doi: 10.1093/molbev/msu121. Epub 2014 Apr 8.

Bayesian inference of infectious disease transmission from whole-genome sequence data

Xavier Didelot et al. Mol Biol Evol. 2014 Jul.

Abstract

Genomics is increasingly being used to investigate disease outbreaks, but an important question remains unanswered: how well do genomic data capture known transmission events, particularly for pathogens with long carriage periods or large within-host population sizes? Here we present a novel Bayesian approach to reconstruct densely sampled outbreaks from genomic data while considering within-host diversity. We infer a time-labeled phylogeny using Bayesian evolutionary analysis by sampling trees (BEAST), and then infer a transmission network via a Monte Carlo Markov chain. We find that under a realistic model of within-host evolution, reconstructions of simulated outbreaks contain substantial uncertainty even when genomic data reflect a high substitution rate. Reconstruction of a real-world tuberculosis outbreak displayed similar uncertainty, although the correct source case and several clusters of epidemiologically linked cases were identified. We conclude that genomics cannot wholly replace traditional epidemiology but that Bayesian reconstructions derived from sequence data may form a useful starting point for a genomic epidemiology investigation.

Figures

Fig. 1.
The colored genealogical tree (left). Each host isolate corresponds to a unique color. A lineage is colored according to the host it was in at the corresponding time. When a lineage changes from color ci to cj (forward in time), this represents i infecting j. Each color may not persist in the tree after the time of the corresponding tip, because this is the recovery time of the host. The subtree restricted to a single color (right) is the part of the tree inside the corresponding host; lineages are taken from this tree at the recovery time and the times when the host infected others.
Fig. 2.
(A) Transmission tree simulated with N = 100, formula image, and formula image. The numbers in square brackets represent the time of infection and removal of each individual, respectively. (B) Genealogical tree simulated conditional on the transmission tree in (A) and with parameter formula image. Transmission events are indicated by red stars and a change in branch color. (C) Same as in (B) but with parameter formula image.
Fig. 3.
(A) Network representation of the posterior distribution of transmission trees for the simulated data set shown in figure 2C. Nodes represent hosts, and the numbers within square brackets give the mean inferred infection time and the known removal time. Edges represent a posterior probability of transmission of at least 10%, with a darker edge indicating a higher probability. (B) Point estimate of the transmission tree obtained by taking the optimal branching tree in the network shown in part (A).
Fig. 4.
Application to a real-world tuberculosis outbreak. (A) Phylogenetic tree inferred by BEAST. (B) SNVs differentiating the isolates. (C) Transmission network inferred without epidemiological modification. (D) Transmission network inferred with epidemiological modification. In (C) and (D), edges are shown with width and shading proportional to their posterior probability; low-probability edges are omitted. In (A), the phylogenetic tree is colored according to the consensus transmission tree from (D), using the same unique colors for each host as in (C) and (D).
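
The network summary described in the caption of Fig. 3 (edges drawn when the posterior probability of transmission is at least 10%) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the dictionary representation of a sampled transmission tree, and the toy samples are all assumptions made for illustration. The optimal-branching point estimate of Fig. 3B is not reproduced here.

```python
from collections import Counter

def edge_posteriors(sampled_trees):
    """Estimate the posterior probability of each directed transmission edge.

    sampled_trees is a list of posterior samples; each sample is a dict
    mapping host -> infector, with None for the source case.
    (Hypothetical representation, chosen for this sketch.)
    """
    counts = Counter()
    for tree in sampled_trees:
        for host, infector in tree.items():
            if infector is not None:
                counts[(infector, host)] += 1
    n = len(sampled_trees)
    return {edge: c / n for edge, c in counts.items()}

def network_edges(posteriors, threshold=0.10):
    """Keep only edges whose posterior probability reaches the threshold."""
    return {edge: p for edge, p in posteriors.items() if p >= threshold}

# Toy posterior sample of 20 transmission trees over hosts A, B, C (made-up data).
samples = (
    [{"A": None, "B": "A", "C": "B"}] * 12
    + [{"A": None, "B": "A", "C": "A"}] * 7
    + [{"A": None, "B": "C", "C": "A"}] * 1
)

post = edge_posteriors(samples)
kept = network_edges(post)
for (infector, host), p in sorted(kept.items()):
    print(f"{infector} -> {host}: {p:.2f}")
```

In this toy sample the edge from C to B appears in only 1 of 20 trees (5%), so it falls below the 10% cutoff and is left out of the network, while the three better-supported edges are retained.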

