J R Soc Interface. 2025 Jan;22(222):20240456.
doi: 10.1098/rsif.2024.0456. Epub 2025 Jan 8.

State-space modelling using wastewater virus and epidemiological data to estimate reported COVID-19 cases and the potential infection numbers


Syun-Suke Kadoya et al. J R Soc Interface. 2025 Jan.

Abstract

The current situation of COVID-19 measures makes it difficult to accurately assess the prevalence of SARS-CoV-2 due to a decrease in reporting rates, leading to missed initial transmission events and subsequent outbreaks. There is growing recognition that wastewater virus data assist in estimating potential infections, including asymptomatic and unreported infections. Understanding the COVID-19 situation hidden behind the reported cases is critical for decision-making when choosing appropriate social intervention measures. However, current models implicitly assume homogeneity in human behaviour, such as virus shedding patterns within the population, making it challenging to predict the emergence of new variants due to variant-specific transmission or shedding parameters. This can result in predictions with considerable uncertainty. In this study, we established a state-space model based on wastewater viral load to predict both reported cases and potential infection numbers. Our model using wastewater virus data showed high goodness-of-fit to COVID-19 case numbers despite the dataset including waves of two distinct variants. Furthermore, the model successfully provided estimates of potential infection, reflecting the superspreading nature of SARS-CoV-2 transmission. This study supports the notion that wastewater surveillance and state-space modelling have the potential to effectively predict both reported cases and potential infections.

Keywords: SARS-CoV-2; prediction of COVID-19 infection; state-space modelling; wastewater-based epidemiology.

Conflict of interest statement

We declare we have no competing interests.

Figures

Figure 1.
The basic paradigm of the state-space model established in this study, composed of the state and observation equations. The potential infection number Yt and the potential wastewater virus load Xt at time t are latent states that are not directly observed. The state transitions include the process errors ζX,t and ζY,t. Xt is used to correct Yt through the coefficient φ. The observed data (wastewater virus load xt and reported cases yt) are generated from the corresponding states through the limitation terms αt and βt, which change over time.
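The structure described in this caption can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the model fitted in the study: the random-walk transitions on a log10 scale, the way φ enters the update of Yt, the ranges drawn for αt and βt, and the function name `simulate_state_space` are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_state_space(n_weeks, phi=0.5, sd_x=0.1, sd_y=0.1, seed=0):
    """Simulate latent states and observations (illustrative assumptions only)."""
    rng = random.Random(seed)
    # Hypothetical initial latent states on a log10 scale.
    X, Y = 5.0, 3.0
    rows = []
    for t in range(n_weeks):
        # State equations (assumed random walks).
        dx = rng.gauss(0.0, sd_x)             # process error zeta_X,t
        X += dx
        # Assumed form of the correction: Y moves with phi times the change in X.
        Y += phi * dx + rng.gauss(0.0, sd_y)  # process error zeta_Y,t
        # Time-varying limitation terms, assumed to lie in (0, 1].
        alpha_t = rng.uniform(0.5, 1.0)
        beta_t = rng.uniform(0.2, 0.6)
        # Observation equations: only a fraction of each state is observed.
        x_obs = X + math.log10(alpha_t)       # observed log10 virus load
        y_obs = beta_t * 10 ** Y              # reported cases
        rows.append({"t": t, "X": X, "Y": Y, "x_obs": x_obs, "y_obs": y_obs})
    return rows


if __name__ == "__main__":
    for row in simulate_state_space(5):
        print(row)
```

Under these assumptions the observed series never exceed the latent states, which mirrors the idea that reported cases are a time-varying fraction of potential infections.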
Figure 2.
Fitting state-space models ((a) model 1; (b) model 2; (c) model 3; and (d) model 4) to the reported COVID-19 cases (upper panels) and the posterior distributions of coefficients φ1 and φ2 (orange in (c)) (lower panels). In the upper panels, circles, lines and shading indicate the observed data points, the estimated values (median) and the 95% credible intervals, respectively. RMSE is the root mean squared error. In the lower panels, the circle and value at the top of each distribution mark the median.
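The summary quantities named in this caption (RMSE, posterior median and 95% credible interval) can be computed as in the following sketch. It assumes posterior draws are available as a plain list of numbers and uses an equal-tailed interval with linear interpolation between order statistics; the helper names are hypothetical, and the study may have computed its intervals differently.

```python
import math
import statistics


def rmse(observed, estimated):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def percentile(samples, q):
    """Empirical quantile q in [0, 1] with linear interpolation."""
    ordered = sorted(samples)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize_posterior(samples):
    """Return (median, lower, upper) for an equal-tailed 95% interval."""
    return (
        statistics.median(samples),
        percentile(samples, 0.025),
        percentile(samples, 0.975),
    )
```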
Figure 3.
Fitting the state-space models ((a) model 2; (b) model 3; (c) model 4) to the measured SARS-CoV-2 load in wastewater. Circles, blue lines and blue shading show the observed data, the model estimates (median) and the 95% credible intervals, respectively. Red lines and shading show the estimated state values and their credible intervals. RMSE is the root mean squared error.
Figure 4.
The estimated limitation terms α (upper) and β (lower) for all models. Grey and green distributions are the prior and posterior distributions, respectively. Model 1 does not include state and observation equations for wastewater virus load, so α is not estimated. For β, solid lines and shading indicate the median values and 95% credible intervals.
Figure 5.
Data splitting and the prediction performance of four state-space models. (a) The datasets of log10 SARS-CoV-2 genome copies in wastewater (green plot) and reported COVID-19 cases (grey plot) in Miyagi Prefecture, Japan, were split into training and test datasets to evaluate how far ahead the established model retained its prediction performance. White circles indicate imputed wastewater viral load data. (b) Using the data split in (a), the predicted and reported cases were compared between models 1–4. Plots in deeper green show the results of near-future prediction. In models 2 and 3, reported cases were predicted using either the predicted (A) or the observed (B) wastewater viral load, to check whether predictions based on the predicted viral load were comparable with those based on the observed one, because using the predicted viral load is preferable in practice. (c) As shown in (a) and (b), the models predicted the next week's reported cases well. To verify near-future prediction performance on several test datasets, the models were updated by increasing the number of training datasets, and the next week's cases were predicted sequentially (trials 1–7). (d) The ratio of the predicted case number from the data splitting approach in (c) to the reported case number, shown as box plots (n = 7; white diamonds indicate the median).
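The sequential evaluation in panels (c) and (d) resembles an expanding-window scheme: the training set grows by one week per trial and the following week is predicted. A minimal sketch follows, assuming that the last `n_trials` weeks serve as the test points and using a placeholder last-value predictor in place of the fitted state-space models; the function names and the toy series are hypothetical.

```python
def expanding_window_splits(n_weeks, n_trials):
    """Return (train_indices, test_index) pairs with a growing training set."""
    first_test = n_weeks - n_trials
    if n_trials < 1 or first_test < 1:
        raise ValueError("need at least one training week and one trial")
    return [(list(range(t)), t) for t in range(first_test, n_weeks)]


def prediction_ratios(reported, predict_next, n_trials=7):
    """Ratio of predicted to reported cases for each one-week-ahead trial."""
    ratios = []
    for train_idx, test_idx in expanding_window_splits(len(reported), n_trials):
        history = [reported[i] for i in train_idx]
        ratios.append(predict_next(history) / reported[test_idx])
    return ratios


def last_value(history):
    """Placeholder predictor (not the paper's model): repeat the latest week."""
    return history[-1]


if __name__ == "__main__":
    weekly_cases = [120, 150, 180, 240, 300, 280, 260, 310, 400, 380]  # toy data
    print(prediction_ratios(weekly_cases, last_value, n_trials=7))
```

A ratio near one in a given trial means the one-week-ahead prediction was close to the reported number, which is what the box plots in (d) summarize.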
Figure 6.
The potential COVID-19 infections estimated by models 2 and 4, characterized under three scenarios. ((a) model 2; (b) model 4) The fold difference (the ratio of potential to reported case numbers) was compared with the E value (the expected total number of cases within a week originating from one primary infected person), simulated under each scenario from the effective reproduction number. The dashed line marks where the ratio of the fold difference to the E value equals one. Scenario A gave the smallest E values, so some fold difference/E values were high (blue circles). Under the superspreading-based scenario (scenario C), the fold difference/E values clustered around one (red circles). Scenario B gave relatively smaller E values than scenario C, but some deviations from one were found (white circles). ((c) model 2; (d) model 4) The predictability of the potential case numbers from one to seven weeks after the test dataset was evaluated based on the E value of scenario C.
