Environ Res. 2022 Nov;214(Pt 1):113809.
doi: 10.1016/j.envres.2022.113809. Epub 2022 Jul 5.

Data modelling recipes for SARS-CoV-2 wastewater-based epidemiology

Wolfgang Rauch et al. Environ Res. 2022 Nov.

Abstract

Wastewater-based epidemiology is recognized as one of the monitoring pillars that provide essential information for pandemic management. Central to the methodology are data modelling concepts, both for communicating the monitoring results and for analysing the signal. Because the field has developed rapidly, a range of modelling concepts is in use without a coherent framework. This paper provides such a framework, focusing on robust, simple and readily applicable concepts rather than on the latest findings from, e.g., machine learning. It is demonstrated that data preprocessing is crucial, most importantly normalization by means of biomarkers and equal temporal spacing of the scattered data. For the latter, downsampling to a weekly spaced series is sufficient. Data smoothing also turned out to be essential, not only for communicating the signal dynamics but also for regression, nowcasting and forecasting. Correlating the signal with epidemic indicators requires multivariate regression, because the signal alone cannot explain the dynamics; for this case study, multiple linear regression proved to be a suitable tool when the focus is on understanding and interpretation. It was also demonstrated that short-term prediction (7 days) is accurate with simple models (exponential smoothing or autoregressive models), but forecast accuracy deteriorates quickly for longer horizons.

Keywords: Data modelling; Forecast; Regression; SARS-CoV-2; Smoothing; Wastewater-based epidemiology.
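
The two preprocessing steps named in the abstract, downsampling to a weekly series and smoothing, can be illustrated with a minimal sketch. It assumes block-wise averaging per ISO calendar week and a centered simple moving average; the function names, the window length and the sample values are hypothetical and not taken from the paper. Weeks without any sample are simply skipped here, so a real series with gaps would still need interpolation to become equally spaced.

```python
from collections import defaultdict
from datetime import date


def weekly_block_average(samples):
    """Downsample irregular (date, value) samples to one mean per ISO week."""
    blocks = defaultdict(list)
    for day, value in samples:
        iso = day.isocalendar()  # (ISO year, ISO week, weekday)
        blocks[(iso[0], iso[1])].append(value)
    # Sort by (year, week) so the result is in chronological order.
    return [(week, sum(v) / len(v)) for week, v in sorted(blocks.items())]


def simple_moving_average(values, window=3):
    """Centered moving average; the window shrinks at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical, made-up measurements: (sampling date, normalized signal).
samples = [
    (date(2021, 1, 4), 10.0),
    (date(2021, 1, 7), 14.0),
    (date(2021, 1, 12), 20.0),
    (date(2021, 1, 20), 30.0),
    (date(2021, 1, 22), 34.0),
]
weekly = weekly_block_average(samples)
smoothed = simple_moving_average([v for _, v in weekly], window=3)
```

Averaging within each week trades temporal resolution for a regular grid, which is what the subsequent smoothing and regression steps rely on.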

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Timeline of SARS-CoV-2 measurements (raw data in copies/ml) and daily documented new infections.
Fig. 2
Left: Downsampling (block-wise averaging) and Interpolation (Shepard Filter) to derive an equally spaced weekly time series for the Vienna case study. Right: Upsampling by means of Interpolation (Shepard Filter) towards a daily series.
Fig. 3
Smoothing of the Vienna time series. Comparison of SMA (weekly) and LOESS (daily) with measurements (Sample).
Fig. 4
Weekly spaced - smoothed dataset (Type 3) for regression analysis of the Vienna case study.
Fig. 5
Regression analysis of the Vienna case study – estimating incidence (y) with multivariate linear regression (ye).
Fig. 6
Left: Box-Cox transformed Lvirus data series (daily values) and Right: Q-Q plot of the transformed series.
Fig. 7
Post-sample analysis of autoregressive models. The consecutive 7-day forecasts are plotted against the observation for both the daily and weekly spaced series.
Fig. 8
Regression analysis of the Vienna case study – estimating incidence (measured) with a third-order polynomial (POL), support vector regression (SVR) and multivariate linear regression (MLR). Data Type 3 (weekly – smoothed).
Fig. 9
RMSE results of the post-sample analysis for Simple Exponential Smoothing (SES) and the Autoregressive Model (AR) for daily and weekly spaced time series. Both series are smoothed. Forecast horizons: 7 days and 14 days. Data transformations: differencing, and Box-Cox transformation plus differencing.
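
The post-sample analysis in Fig. 7 and Fig. 9 can be approximated with a rolling-origin evaluation. The sketch below is a simplified illustration, not the authors' implementation: it uses plain simple exponential smoothing with an assumed smoothing factor `alpha`, omits the differencing and Box-Cox transformations, and runs on made-up numbers. On a weekly spaced series, a horizon of 1 step corresponds to 7 days and 2 steps to 14 days.

```python
import math


def ses_forecast(series, alpha=0.5):
    """Simple exponential smoothing; returns the final level, which is the
    flat forecast for every future step."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def post_sample_rmse(series, horizon, n_train, alpha=0.5):
    """Rolling-origin evaluation: fit on series[:t], then compare the
    forecast with the observation `horizon` steps ahead."""
    squared_errors = []
    for t in range(n_train, len(series) - horizon + 1):
        forecast = ses_forecast(series[:t], alpha)
        actual = series[t + horizon - 1]
        squared_errors.append((forecast - actual) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical weekly smoothed signal (made-up numbers with a rising trend).
signal = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 37.0, 45.0]
rmse_7d = post_sample_rmse(signal, horizon=1, n_train=4)
rmse_14d = post_sample_rmse(signal, horizon=2, n_train=4)
```

Because simple exponential smoothing produces a flat forecast, its error grows with the horizon whenever the signal is trending, which is consistent with the abstract's observation that accuracy deteriorates beyond short horizons.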
