PLoS Comput Biol. 2022 Jun 3;18(6):e1010115.
doi: 10.1371/journal.pcbi.1010115. eCollection 2022 Jun.

Addressing delayed case reporting in infectious disease forecast modeling

Lauren J Beesley et al. PLoS Comput Biol.

Abstract

Infectious disease forecasting is of great interest to the public health community and policymakers, since forecasts can provide insight into disease dynamics in the near future and inform interventions. Due to delays in case reporting, however, forecasting models may underestimate the current and future disease burden. In this paper, we propose a general framework for addressing reporting delay in disease forecasting efforts with the goal of improving forecasts. We propose strategies for leveraging either historical data on case reporting or external internet-based data to estimate the amount of reporting error. We then describe several approaches for adapting general forecasting pipelines to account for under- or over-reporting of cases. We apply these methods to address reporting delay in data on dengue fever cases in Puerto Rico from 1990 to 2009 and to reports of influenza-like illness (ILI) in the United States between 2010 and 2019. Through a simulation study, we compare method performance and evaluate robustness to assumption violations. Our results show that forecasting accuracy and prediction coverage almost always increase when correction methods are implemented to address reporting delay. Some of these methods require knowledge about the reporting error or high-quality external data, which may not always be available. Alternatives include excluding recently reported data and performing sensitivity analysis. This work provides intuition and guidance for handling delay in disease case reporting and may serve as a useful resource to inform practical infectious disease forecasting efforts.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Proportion of eventually-reported cases that were reported in each week.¹
¹ Estimates of π_ts(d) − π_ts(d − 1), obtained using Eq 6 and stratified by season/year.
Fig 2. Summary of methods for handling reporting delay.
Fig 3. Performance of nowcasts and forecasts in the Puerto Rico dengue fever and national US influenza-like illness data across all weeks.¹
¹ Results for dengue fever are aggregated across each of 50 weeks in 18 seasons (1992–2009). Results for US national influenza are aggregated across 35 weeks in 7 seasons. The ensemble method corresponds to an equal-weight linear combination of all methods except validation data analysis. “Model” indicates that reporting factors were estimated via regression and allowed to vary by t and s. “Lag” indicates that reporting factors were estimated via Eq 6. “Local” indicates that reporting factors were estimated via Eq 8. Relative absolute biases are calculated relative to the largest value in each column.
Fig 4. Proportion of weeks in which each of 4 methods performs best in terms of 1-week forecast weighted interval scores.¹
¹ Results for dengue fever in Puerto Rico are aggregated across all 50 weeks in 18 calendar years (1992–2009). Results for US national influenza are aggregated across 35 weeks in 7 seasons (2012–2013 through 2018–2019). Results for US state-level influenza are aggregated across 35 weeks in 49 states (excluding Florida) for the 2018–2019 season.
Fig 5. Performance of proposed methods for handling reporting delay in 2009 across 100 simulated datasets using ARMA models.¹
¹ Results are aggregated across all 50 weeks in 100 replicate seasons. Each result, therefore, aggregates 5000 nowcasts or forecasts. When reporting factors varied across seasons, π_2007 = π_2008 = (0.01, 0.05, 0.55, 0.85, 0.95, 0.98, 1) and π_2009 = (0.04, 0.54, 0.84, 0.94, 0.97, 0.99, 1). The ensemble method corresponds to an equal-weight linear combination of all methods except validation data analysis and exclusions of 4 and 5 weeks’ data. “Model” indicates that reporting factors were estimated via regression and allowed to vary by t and s. “Lag” indicates that reporting factors were estimated via Eq 6. “Local” indicates that reporting factors were estimated via Eq 8. Relative absolute biases are calculated relative to the largest value in each column. Model-based π_ts(d) estimation assumes reporting factors vary across weeks but incorrectly models how reporting factors vary across weeks.
Fig 6. Coverage of 95% prediction intervals for 1-week forecasts across various assumed reporting profiles in 2008 and 2009 simulated datasets.¹
¹ Circled coverages correspond to correctly-specified reporting factors.
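
The notes to Figs 1 and 5 refer to cumulative reporting factors π(d) and their week-to-week differences π(d) − π(d − 1). As a rough illustration only, the sketch below takes the simulated 2009 reporting factors from the Fig 5 note (fourth entry taken as 0.94), computes those differences, and rescales recent counts by dividing by π(d). The function names, the convention that the first entry corresponds to delay d = 0, and the simple divide-by-π correction are all assumptions made for this example; Eq 6, Eq 8 and the paper's actual correction methods are not reproduced on this page.

```python
import numpy as np

# Cumulative reporting factors for the simulated 2009 season, as listed in
# the Fig 5 note. ASSUMPTION: the first entry corresponds to delay d = 0
# (the most recent week) and the fourth entry is 0.94.
pi_2009 = np.array([0.04, 0.54, 0.84, 0.94, 0.97, 0.99, 1.0])


def weekly_increments(pi):
    """Share of eventually-reported cases first reported at each delay d,
    i.e. pi(d) - pi(d - 1), with pi(-1) taken to be 0."""
    return np.diff(pi, prepend=0.0)


def rescale_recent_counts(reported, pi):
    """Hypothetical helper: inflate the most recent counts by 1 / pi(d).

    `reported` is ordered oldest to newest, so the last element has delay 0,
    the one before it delay 1, and so on. Weeks older than len(pi) - 1 are
    treated as fully reported and left unchanged.
    """
    reported = np.asarray(reported, dtype=float)
    corrected = reported.copy()
    n = min(len(pi), len(reported))
    for d in range(n):
        corrected[-1 - d] = reported[-1 - d] / pi[d]
    return corrected


if __name__ == "__main__":
    print(weekly_increments(pi_2009))
    # Made-up counts for four consecutive weeks, oldest first.
    print(rescale_recent_counts([50, 84, 54, 4], pi_2009))
```

Dividing by a very small π(d), such as 0.04 for the most recent week here, amplifies noise in the raw count considerably, which is one reason the alternative mentioned in the abstract (excluding recently reported data) can be attractive when reporting factors are poorly known.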
