[Preprint]. 2023 Mar 11:arXiv:2009.02654v3.

Semi-parametric modeling of SARS-CoV-2 transmission using tests, cases, deaths, and seroprevalence data

Damon Bayer et al. ArXiv.

Abstract

Mechanistic models fit to streaming surveillance data are critical to understanding the transmission dynamics of an outbreak as it unfolds in real time. However, transmission model parameter estimation can be imprecise, and sometimes even impossible, because surveillance data are noisy and not informative about all aspects of the mechanistic model. To partially overcome this obstacle, Bayesian models have been proposed to integrate multiple surveillance data streams. We devised a modeling framework for integrating SARS-CoV-2 diagnostic test and mortality time series data, as well as seroprevalence data from cross-sectional studies, and tested the importance of individual data streams for both inference and forecasting. Importantly, our model for incidence data accounts for changes in the total number of tests performed. We model three quantities as time-varying and estimate their changes nonparametrically: the transmission rate, the infection-to-fatality ratio, and a parameter controlling a functional relationship between the true case incidence and the fraction of positive tests. We compare our base model against modified versions that do not use diagnostic test counts or seroprevalence data to demonstrate the utility of including these often unused data streams. We apply our Bayesian data integration method to COVID-19 surveillance data collected in Orange County, California between March 2020 and February 2021 and find that 32-72% of Orange County residents experienced SARS-CoV-2 infection by mid-January 2021. Despite this high number of infections, our results suggest that the abrupt end of the winter surge in January 2021 was due to both behavioral changes and a high level of accumulated natural immunity.
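To make the compartmental structure concrete, the following is a minimal deterministic sketch of a model with the S, E, I, R, and D compartments named in the Figure 2 caption, driven by a time-varying transmission rate. It is not the authors' Bayesian model and performs no inference. The function name `simulate_seird`, the Euler time-stepping, and all default rates (latent period, infectious period, infection-to-fatality ratio, initial infectious count) are illustrative assumptions; only the population of 3.2 million is taken from the Figure 5 caption.

```python
def simulate_seird(beta, days, population=3_200_000, initial_infectious=100,
                   latent_days=3.0, infectious_days=7.0, ifr=0.005,
                   steps_per_day=10):
    """Euler-integrate a deterministic SEIRD model.

    beta: callable mapping a day index to a transmission rate (per day),
          which lets the transmission rate vary over time.
    All default values are assumptions chosen for illustration only.
    Returns a list of (S, E, I, R, D) tuples, one per day, including day 0.
    """
    s = float(population - initial_infectious)
    e, i, r, d = 0.0, float(initial_infectious), 0.0, 0.0
    dt = 1.0 / steps_per_day
    trajectory = [(s, e, i, r, d)]
    for day in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta(day) * s * i / population * dt  # S -> E
            new_infectious = e / latent_days * dt              # E -> I
            removed = i / infectious_days * dt                 # I -> R or D
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - removed
            r += (1.0 - ifr) * removed  # survivors recover
            d += ifr * removed          # a fraction `ifr` of removals die
        trajectory.append((s, e, i, r, d))
    return trajectory


if __name__ == "__main__":
    # Hypothetical scenario: transmission rate halves after day 30.
    path = simulate_seird(lambda day: 0.4 if day < 30 else 0.2, days=90)
    s, e, i, r, d = path[-1]
    print(f"day 90: susceptible={s:.0f}, deaths={d:.1f}")
```

Passing `beta` as a function of the day index is one simple way to represent a transmission rate that changes over time; a Bayesian treatment would instead place a prior on that trajectory and estimate it from data.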

Figures

Figure 1:
COVID-19 surveillance data from Orange County, CA. The figure shows weekly counts of tests, cases (positive tests), reported deaths due to COVID-19, as well as testing positivity.
Figure 2:
Model diagram depicting possible progressions between infection states. The model compartments are as follows: susceptible (S), infected, but not yet infectious (E), infectious (I), recovered (R), and deceased (D).
Figure 3:
Posterior distributions of the time-varying basic reproductive number R0, effective reproductive number Re, infection-to-fatality ratio (IFR), proportion in the proportional log-odds model of the beta-binomial observational model for cases α, weekly latent:case ratio, and cumulative latent:case ratio. Solid blue lines show point-wise posterior medians, while shaded areas denote 50%, 80%, and 95% Bayesian credible intervals.
Figure 4:
Posterior inference for the effective reproduction number from the full model and epidemia fit to the Orange County data.
Figure 5:
Latent and observed cumulative death (left) and incidence (center) trajectories and latent prevalence trajectories (right) in Orange County, CA (population 3.2 million). Solid blue lines show point-wise posterior medians, while shaded areas denote 50%, 80%, and 95% Bayesian credible intervals. Black circles denote observed data. Note that the posterior predictive distributions of latent deaths and cases are not forecasts of their observed counterparts. Forecasts are plotted in Figure 6.
Figure 6:
Forecast distributions for observed deaths (left column) and testing positivity (right column). Solid blue lines show point-wise posterior medians, while shaded areas denote 50%, 80%, and 95% Bayesian credible intervals. Observed values are presented as black circles.
Figure 7:
Comparison of the Continuous Ranked Probability Score (CRPS) for models fit to the Orange County data. Lower is better.
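For readers unfamiliar with the score used in Figure 7, the continuous ranked probability score compares a forecast distribution with a single observed value. When the forecast is represented by draws X, X' from the forecast distribution, it can be estimated as E|X - y| - 0.5 E|X - X'|. The sketch below implements that standard sample-based estimator; the function name `crps_from_samples` is hypothetical, and this is not the authors' evaluation code.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    samples:  draws from the forecast distribution
    observed: the value that was actually observed
    Lower scores indicate a forecast that is both closer to the
    observation and appropriately concentrated.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one forecast sample")
    # Average distance between forecast draws and the observation.
    abs_error = sum(abs(x - observed) for x in samples) / n
    # Average distance between all pairs of forecast draws (O(n^2)).
    spread = sum(abs(x - y) for x in samples for y in samples) / (n * n)
    return abs_error - 0.5 * spread


if __name__ == "__main__":
    print(crps_from_samples([0.0, 2.0], 1.0))  # 0.5
```

The pairwise term makes this estimator quadratic in the number of samples, which is acceptable for a few thousand posterior draws; sorting-based formulas exist for larger sample sizes.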
