Comparative Study
Epidemics. 2013 Sep;5(3):131-45.
doi: 10.1016/j.epidem.2013.05.002. Epub 2013 Jun 3.

Comparing methods for estimating R0 from the size distribution of subcritical transmission chains

S Blumberg et al. Epidemics. 2013 Sep.

Abstract

Many diseases exhibit subcritical transmission (i.e. 0<R0<1) so that infections occur as self-limited 'stuttering chains'. Given an ensemble of stuttering chains, information about the number of cases in each chain can be used to infer R0, which is of crucial importance for monitoring the risk that a disease will emerge to establish endemic circulation. However, the challenge of imperfect case detection has led authors to adopt a variety of work-around measures when inferring R0, such as discarding data on isolated cases or aggregating intermediate-sized chains together. Each of these methods has the potential to introduce bias, but a quantitative comparison of these approaches has not been reported. By adapting a model based on a negative binomial offspring distribution that permits a variable degree of transmission heterogeneity, we present a unified analysis of existing R0 estimation methods. Simulation studies show that the degree of transmission heterogeneity, when improperly modeled, can significantly impact the bias of R0 estimation methods designed for imperfect observation. These studies also highlight the importance of isolated cases in assessing whether an estimation technique is consistent with observed data. Analysis of data from measles outbreaks shows that likelihood scores are highest for models that allow a flexible degree of transmission heterogeneity. Aggregating intermediate-sized chains often has similar performance to analyzing a complete chain size distribution. However, truncating isolated cases is beneficial only when surveillance systems clearly favor full observation of large chains but not small chains. Meanwhile, if data on the type and proportion of cases that are unobserved were known, we demonstrate that maximum likelihood inference of R0 could be adjusted accordingly. This motivates the need for future empirical and theoretical work to quantify observation error and incorporate relevant mechanisms into stuttering chain models used to estimate transmission parameters.
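
To make the modeling setup concrete, the following is a minimal sketch of a chain size distribution for a branching process with a negative binomial offspring distribution (mean R0, dispersion k), seeded by a single case. The closed form used here is the standard result for this branching process and is assumed, not quoted from this page; the function names and the parameter values are illustrative only. The sketch also computes R0 as the fraction of cases that are secondary, one minus the number of chains divided by the number of cases.

```python
import math

def chain_size_pmf(j, r0, k):
    """P(chain size = j) for a chain seeded by one case (assumed closed form)."""
    log_p = (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
             + (j - 1) * math.log(r0 / k)
             - (k * j + j - 1) * math.log1p(r0 / k))
    return math.exp(log_p)

def r0_from_chain_sizes(sizes):
    """Estimate R0 as the fraction of cases that are secondary: 1 - chains/cases."""
    return 1.0 - len(sizes) / sum(sizes)

# Illustrative (hypothetical) parameter values, not taken from the measles data.
r0, k = 0.5, 0.3
pmf = [chain_size_pmf(j, r0, k) for j in range(1, 2001)]
total = sum(pmf)                                            # should be close to 1
mean_size = sum(j * p for j, p in enumerate(pmf, start=1))  # close to 1 / (1 - r0)
print(total, mean_size)
print(r0_from_chain_sizes([1, 1, 1, 2, 1, 5, 1, 3]))        # made-up chain sizes
```

The numerical mean chain size should agree with 1/(1 − R0) whatever the value of k, which is why the simple fraction-of-secondary-cases estimate does not need k.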

Keywords: Basic reproductive number; Imperfect observation; Measles; Stuttering chain; Transmission heterogeneity.

Figures

Figure 1. Inference of R0 and k for measles data
A) Distribution of chain sizes for measles data in the United States (1997–1999) and Canada (1998–2001). B) Weighted distribution of the same data, showing the distribution of cases according to the size of the chain they belong to. C) Results of inferring R0 and k for measles in the United States. Markers depict the maximum likelihood estimates (MLE) as determined by five different approaches. The full-distribution MLE assumes a negative binomial offspring distribution and uses the complete chain size distribution to infer both R0 and k. The contour line shows the corresponding 95% confidence region. The truncated estimates only consider chains that are size two or greater. The aggregated estimates are based on the number of isolated cases, the total number of chains and the size of the largest chain. Both the truncated and aggregated estimates assume either a geometric offspring distribution with k = 1 (lower cross marks) or a Poisson offspring distribution with k → ∞ (higher cross marks). D) Analogous to panel C, but for measles in Canada.
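
The difference between a full-distribution estimate and a truncated estimate can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the standard negative binomial chain size formula, treats truncation as conditioning the likelihood on chain size two or greater, and replaces real data with the exact chain size probabilities for hypothetical true values (R0 = 0.5, k = 0.3), so the result is the large-sample limit of each estimator. All function names are made up for the example.

```python
import math

def log_pmf(j, r0, k):
    """log P(chain size = j) under a negative binomial offspring distribution."""
    return (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
            + (j - 1) * math.log(r0 / k)
            - (k * j + j - 1) * math.log1p(r0 / k))

def expected_loglik(r0_cand, k_cand, weights, truncated):
    """Expected log-likelihood of candidate parameters, weighted by true probabilities."""
    # Truncation: renormalize by the probability that a chain has size >= 2.
    log_norm = math.log1p(-math.exp(log_pmf(1, r0_cand, k_cand))) if truncated else 0.0
    total = 0.0
    for j, w in weights.items():
        if truncated and j == 1:
            continue  # isolated cases are discarded
        total += w * (log_pmf(j, r0_cand, k_cand) - log_norm)
    return total

def grid_mle(k_cand, weights, truncated):
    """Grid search over R0 in (0, 1) for a fixed, assumed dispersion k_cand."""
    grid = [i / 200 for i in range(1, 200)]
    return max(grid, key=lambda r: expected_loglik(r, k_cand, weights, truncated))

true_r0, true_k = 0.5, 0.3  # hypothetical values for illustration
weights = {j: math.exp(log_pmf(j, true_r0, true_k)) for j in range(1, 401)}

full = grid_mle(true_k, weights, truncated=False)        # full distribution, correct k
trunc_true_k = grid_mle(true_k, weights, truncated=True) # truncated, correct k
trunc_geom = grid_mle(1.0, weights, truncated=True)      # truncated, k = 1 assumed
print(full, trunc_true_k, trunc_geom)
```

In this sketch both estimators recover 0.5 when k is specified correctly, while the truncated estimator with k = 1 wrongly assumed lands above 0.5, which illustrates how mis-modeled heterogeneity can bias a truncated estimate.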
Figure 2. Absolute bias of R0 estimation associated with independent observation error
Simulated observation scenarios used to measure absolute bias are created by assuming that each case has an independent and identical probability of being detected. The curves in each panel show how the bias varies as a function of the observation probability and estimator choice. Results are shown for the R̂0,MLE, R̂0−T,k=1, R̂0−T,k→∞, R̂0−A,k=1 and R̂0−A,k→∞ estimators. Each panel corresponds to a different pair of true R0 and k values.
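
The independent observation scenario can be explored without simulation by computing the large-sample limit of the simple estimate 1 − chains/cases. The sketch below is an illustration under stated assumptions: it uses the standard negative binomial chain size formula, treats the detected cases of one true chain as one observed chain, and assumes chains with no detected case go unnoticed. The function names and parameter values are hypothetical.

```python
import math

def chain_size_pmf(j, r0, k):
    """P(chain size = j) under a negative binomial offspring distribution."""
    log_p = (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
             + (j - 1) * math.log(r0 / k)
             - (k * j + j - 1) * math.log1p(r0 / k))
    return math.exp(log_p)

def naive_r0_independent(r0, k, p_obs, j_max=2000):
    """Limit of 1 - chains/cases when each case is detected with probability p_obs."""
    seen_chains = 0.0  # expected observed chains per true chain
    seen_cases = 0.0   # expected observed cases per true chain
    for j in range(1, j_max + 1):
        pj = chain_size_pmf(j, r0, k)
        seen_chains += pj * (1.0 - (1.0 - p_obs) ** j)  # at least one case detected
        seen_cases += pj * j * p_obs
    return 1.0 - seen_chains / seen_cases

perfect = naive_r0_independent(0.5, 1.0, 1.0)  # no observation error
half = naive_r0_independent(0.5, 1.0, 0.5)     # each case detected half the time
print(perfect, half, half - 0.5)               # last value is the absolute bias
```

Under these assumptions the estimate equals the true R0 when detection is perfect and falls below it when half of the cases are missed, because chains lose members faster than they disappear entirely.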
Figure 3. Absolute bias of R0 estimation associated with size-dependent observation error
The panels are analogous to those in figure 2 except that each case has an identical and independent probability of being a sentinel case that activates complete observation of the chain they are part of. Chains without a sentinel case are not observed at all. The observation probability plotted here is the overall probability that a randomly chosen case is observed, which can be significantly higher than the probability of being a sentinel case (equation 6).
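
The sentinel scenario can be sketched in the same spirit. The code below is one reading of the caption, not a reproduction of equation 6: a chain of size j is fully observed with probability 1 − (1 − p)^j, where p is the per-case sentinel probability, and the overall observation probability is the expected number of observed cases divided by the expected number of all cases. The chain size formula is the standard negative binomial result, and all names and values are hypothetical.

```python
import math

def chain_size_pmf(j, r0, k):
    """P(chain size = j) under a negative binomial offspring distribution."""
    log_p = (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
             + (j - 1) * math.log(r0 / k)
             - (k * j + j - 1) * math.log1p(r0 / k))
    return math.exp(log_p)

def sentinel_summary(r0, k, p_sentinel, j_max=2000):
    """Return (overall case observation probability, limit of 1 - chains/cases)."""
    seen_chains = seen_cases = all_cases = 0.0
    for j in range(1, j_max + 1):
        pj = chain_size_pmf(j, r0, k)
        p_chain_seen = 1.0 - (1.0 - p_sentinel) ** j  # chain has at least one sentinel
        seen_chains += pj * p_chain_seen
        seen_cases += pj * j * p_chain_seen            # whole chain observed if seen
        all_cases += pj * j
    return seen_cases / all_cases, 1.0 - seen_chains / seen_cases

overall_p, naive_r0 = sentinel_summary(0.5, 1.0, 0.5)
print(overall_p, naive_r0)
```

With these illustrative values the overall observation probability exceeds the sentinel probability, and the simple estimate overshoots the true R0, since large chains are more likely to contain a sentinel case and so are over-represented.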
Figure 4. Inference of R0 when imperfect observation is incorporated into the likelihood calculation
A) The 95% contour and ML value for R0 and k for measles in the USA is shown for three different assumptions about the observation process. The black contour assumes all cases are observed. The green contour assumes each case is observed with an independent probability of 50%. The blue contour assumes each case has a 50% probability of being a sentinel case that activates complete observation of the chain it is part of. B) Analogous to panel A but for measles in Canada.
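
One plausible way to fold independent case detection into the likelihood is to replace the chain size distribution with the distribution of the observed size, conditional on at least one case being detected. The sketch below shows that construction; it is an assumption about how such an adjustment could be built, not the authors' implementation. The function names, the cutoff `j_max`, and the parameter values are hypothetical.

```python
import math

def chain_size_pmf(j, r0, k):
    """P(true chain size = j) under a negative binomial offspring distribution."""
    log_p = (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
             + (j - 1) * math.log(r0 / k)
             - (k * j + j - 1) * math.log1p(r0 / k))
    return math.exp(log_p)

def observed_size_pmf(m, r0, k, p_obs, j_max=400):
    """P(observed size = m | at least one case detected), independent detection."""
    joint = 0.0  # P(true chain yields exactly m detected cases)
    seen = 0.0   # P(true chain yields at least one detected case)
    for j in range(1, j_max + 1):
        pj = chain_size_pmf(j, r0, k)
        seen += pj * (1.0 - (1.0 - p_obs) ** j)
        if j >= m:
            # Binomial thinning: m of the j true cases are detected.
            joint += pj * math.comb(j, m) * p_obs ** m * (1.0 - p_obs) ** (j - m)
    return joint / seen

def log_likelihood(observed_sizes, r0, k, p_obs):
    """Log-likelihood of observed chain sizes for given R0, k and detection probability."""
    return sum(math.log(observed_size_pmf(m, r0, k, p_obs)) for m in observed_sizes)

check_sum = sum(observed_size_pmf(m, 0.5, 1.0, 0.5) for m in range(1, 401))
same_as_true = observed_size_pmf(3, 0.5, 1.0, 1.0) - chain_size_pmf(3, 0.5, 1.0)
print(check_sum, same_as_true)
print(log_likelihood([1, 1, 2, 4], 0.5, 1.0, 0.5))  # made-up observed sizes
```

Maximizing `log_likelihood` over R0 and k for a fixed detection probability would then give an adjusted estimate; with perfect detection the observed-size distribution reduces to the original chain size distribution.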
Figure A.5. Evaluating R0 estimators when data sets are truncated by ignoring isolated cases
A) Size of stuttering chains as a function of R0. The average size of all chains, μ, is independent of k (black line, equation 2). Meanwhile, the average size of chains containing at least one secondary infection, μs, depends on k (colored lines, equation 8 with pobs = 1). B) Root mean square absolute error of the R0 estimate (equation 14) as a function of the true dispersion parameter. The R̂0,MLE estimator (which uses the full chain size distribution) is shown for reference. As explained in the text, the truncated estimators differ only in the way they model transmission heterogeneity. The colored lines correspond to R̂0−T,k=? (blue), R̂0−T,k=1 (green), R̂0−T,k→∞ (red), and R̂0−T,k=k (magenta) and are all based on equation 11. The true R0 is fixed at 0.5. The qualitative behavior is similar for different values of R0 (data not shown). C) Fraction of the absolute error shown in panel B that is due to estimation bias (equation 18). D) Coverage probability of the 95% confidence intervals for the same estimators and transmission parameters shown in panel B. In panels B–D, each data point is based on 2000 simulations. To minimize the effects of sampling variance, N = 1000 for all simulations. The full-distribution (R̂0,MLE) results are hidden in panels B–D because they are essentially identical to the performance of R̂0−T,k=k.
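
The dependence of μs on k described in panel A can be seen from a short derivation. Equations 2 and 8 are not reproduced on this page, so the forms below are a reconstruction under the assumption of perfect observation and a negative binomial offspring distribution, in which an isolated case is simply a case with zero offspring.

```latex
\mu = \frac{1}{1 - R_0}, \qquad
P(1) = \left(1 + \frac{R_0}{k}\right)^{-k}, \qquad
\mu = P(1)\cdot 1 + \bigl(1 - P(1)\bigr)\,\mu_s
\;\Longrightarrow\;
\mu_s = \frac{\mu - P(1)}{1 - P(1)}
```

For example, with R0 = 0.5 and k = 1, P(1) = 2/3 and μ = 2, so μs = (2 − 2/3)/(1/3) = 4. Lowering k raises P(1) and therefore raises μs, even though μ stays fixed.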
Figure A.6. Evaluating R0 estimators in which intermediate stuttering chain sizes are aggregated together
A) Across many simulated chain size distributions, the median size of the largest chain depends on R0, k and the number of observed chains, N. B–D) Analogous to panels B–D in figure A.5 except that the colored curves correspond to the aggregated estimators based on equation 12: R̂0−A,k=? (blue), R̂0−A,k=1 (green), R̂0−A,k→∞ (red), and R̂0−A,k=k (magenta). The full-distribution results are hidden in panels C and D, because they are essentially identical to the aggregated results when k is either known or inferred. The binomial estimator, R̂0,binomial, which has the same value but different confidence intervals than the full-distribution estimator, is based solely on the fraction of cases that are secondary (equation 13, cyan line in panel D).
Figure A.7. Comparing R0 estimators when simulations are based on a Weibull-Poisson offspring distribution
A–C) Analogous to figure A.5B–D but based on a ‘matched’ Weibull-Poisson distribution. D–F) Analogous to figure A.6B–D.
Figure B.8. Comparing model predictions to data
A) Comparing United States measles data to predictions based on the complete chain size distribution (R̂0,MLE) versus a truncated distribution (R̂0−T,k=1 and R̂0−T,k→∞). Error bars show 95% confidence intervals for proportions of each chain size in the data. Table 2 contains the likelihood scores for each set of predictions. B) Similar to panel A but the R̂0,MLE model is now compared to predictions based on aggregating intermediate chain sizes (R̂0−A,k=1 and R̂0−A,k→∞). C) ML values of R0 for the models shown in panels A and B along with the 95% confidence intervals. Green confidence intervals correspond to a geometric distribution, red confidence intervals are for a Poisson offspring distribution and blue confidence intervals are for when k is inferred. (The numeric values for the confidence intervals are all provided in section 3.1.) D–F) Analogous to panels A, B and C, except that estimator predictions are compared to data for measles in Canada. For the R̂0,MLE, R̂0−T,k=? and R̂0−A,k=? estimators, the associated inferred values of k are 0.32, ∞ and 0.27 for measles in the United States. The corresponding values are 0.21, 0.23 and 0.20 for measles in Canada.
Figure C.9. Model predictions for the generation of extinction
A) Predictions for the generation of extinction for measles in the United States. The predictions are based on the inferred value of R0 and either the assumed or inferred value of k for some of the estimators presented in the main text. The reported data are also shown along with 95% confidence intervals for the proportion of chains that went extinct in each generation. B) Analogous to panel A but for measles in Canada. Unfortunately, data are not available for comparison to model predictions.

