Biostatistics. 2022 Jul 18;23(3):807-824.
doi: 10.1093/biostatistics/kxaa059.

Estimation of the generation interval using pairwise relative transmission probabilities

Sarah V Leavitt et al. Biostatistics.

Abstract

The generation interval (the time between infection of primary and secondary cases) and its often-used proxy, the serial interval (the time between symptom onset of primary and secondary cases), are critical parameters in understanding infectious disease dynamics. Because it is difficult to determine who infected whom, these important outbreak characteristics are not well understood for many diseases. We present a novel method for estimating transmission intervals using surveillance or outbreak investigation data that, unlike existing methods, does not require contact tracing data or pathogen whole genome sequence data on all cases. We start with an expectation maximization algorithm and incorporate relative transmission probabilities with noise reduction. We use simulations to show that our method can accurately estimate the generation interval distribution for diseases with different reproductive numbers, generation intervals, and mutation rates. We then apply our method to routinely collected surveillance data from Massachusetts (2010-2016) to estimate the serial interval of tuberculosis in this setting.

Keywords: Hierarchical clustering; Kernel density estimation; Noise reduction; Reproductive number; Serial interval; Tuberculosis.

Figures

Fig. 1
Plots of two example individuals to demonstrate clustering methods, one (left: case A) that has a high probability cluster of infectors (colored in black) and one (right: case B) that does not. The top row shows a scatter-plot of the naive Bayes transmission probabilities for all possible infectors of two individuals. The middle row shows the corresponding dendrograms, using a clustering cutoff of 0.05. The bottom row shows the kernel density estimates for the infectors of individuals in A and B, respectively, using a binwidth of 0.01.
Fig. 2
Violin plots of the absolute bias in days for the mean (dark grey), median (medium grey), and standard deviation (light grey) of the generation interval distribution estimated by various methods for the nine different simulation scenarios: baseline, low and high sample sizes (LowN, HighN), low and high reproductive numbers (LowR, HighR), low and high mutation rates (LowMR, HighMR), and low and high generation interval variances (LowGIV, HighGIV), described in detail in Supplementary Table 1 available at Biostatistics online. The absolute bias equals the observed value minus the true value and is in days. For PEM: top N, PEM: Hierarchical, and PEM: Kernel Density, the pooled results are shown. For the SNP distance method, but for no other method, the bias estimates for multiple scenarios extend above 10 days (the upper limit of this plot), to as high as 33 days. The plot is truncated at 10 days to better visualize the results of the other estimation methods.
Fig. 3
Estimates of the mean, median, and standard deviation for the serial interval of TB in Massachusetts between 2010 and 2016, estimated from relative transmission probabilities with 95% bootstrap confidence intervals. The left panels show the results when clustering the infectors using hierarchical clustering with various cutoffs, and the right panels show the results with kernel density estimation with various binwidths. The solid horizontal lines show the pooled estimates (averaging over all cutoffs/binwidths), with their 95% confidence intervals as dotted lines. The blue lines show the estimates from an unmodified gamma distribution with no restriction on the serial interval, the green lines from a shifted gamma distribution forcing the serial interval to be greater than 1 month, and the red lines from a shifted gamma distribution forcing the serial interval to be greater than 2 months.
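
The hierarchical clustering step described in the Fig. 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `top_cluster`, the choice of single linkage, and the example probabilities are assumptions made for illustration only. With one-dimensional data, single-linkage clustering at a given cutoff is equivalent to sorting the values and splitting wherever neighbouring values differ by more than the cutoff, so no clustering library is needed.

```python
def top_cluster(probs, cutoff=0.05):
    """Return indices of the high-probability infector cluster, or [] if none.

    Assumption (not from the paper): single-linkage clustering in one
    dimension, i.e. sort the probabilities and split wherever the gap
    between neighbouring values exceeds `cutoff`.
    """
    # Candidate infectors ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    cluster = [order[0]] if order else []
    for prev, cur in zip(order, order[1:]):
        if probs[prev] - probs[cur] > cutoff:
            # First large gap from the top: everything above it is the
            # high-probability cluster.
            return sorted(cluster)
        cluster.append(cur)
    # No gap exceeds the cutoff, so all candidates fall into one cluster
    # and no subset of infectors stands out.
    return []


# Hypothetical probabilities for a case like A (clear top cluster)
# and a case like B (no separation).
case_a = [0.40, 0.38, 0.02, 0.01, 0.01, 0.005]
case_b = [0.05, 0.04, 0.03, 0.02, 0.01]
print(top_cluster(case_a))
print(top_cluster(case_b))
```

In this sketch, a case like A yields a small set of plausible infectors, while a case like B yields an empty set because no infector is separated from the rest by more than the cutoff.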
