2021 Oct 14;18(20):10778.
doi: 10.3390/ijerph182010778.

Quest for Optimal Regression Models in SARS-CoV-2 Wastewater Based Epidemiology

Parisa Aberi et al. Int J Environ Res Public Health.

Abstract

Wastewater-based epidemiology is a recognised source of information for pandemic management. In this study, we investigated the correlation between a SARS-CoV-2 signal derived from wastewater sampling and COVID-19 incidence values monitored through individual testing programs. The dataset comprises timelines (approx. five months long) of both signals at four wastewater treatment plants across Austria, two of which drain large communities and two of which drain smaller ones. Eight regression models were investigated to predict the viral incidence under varying data inputs and pre-processing methods. Population-based normalisation and smoothing of the viral load data as pre-processing steps were found to significantly influence the fit of the regression models. Moreover, the time lag between the wastewater data and the incidence derived from the testing program was found to vary between 2 and 7 days, depending on the time period and site. Taking this time lag into account by means of multivariate modelling was necessary to boost the performance of the regression. No single model stood out in the comparison, as all investigated models showed a sufficient correlation for the task. The pre-processing of the data and a multivariate model formulation are more important than the model structure.

Keywords: SARS-CoV-2; Taylor diagram; incidence; multivariate model; regression; wastewater-based epidemiology.
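
The pre-processing and lag handling described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the trailing moving average, the use of flow and population for the per-capita normalisation, and the choice of the lag by maximum Pearson correlation are all assumptions made for illustration. Only the general ideas (normalise, smooth, find the lag, feed several lagged copies of the signal to a multivariate model) come from the abstract.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def normalise_load(conc_copies_per_ml, flow_l_per_day, population):
    """Assumed normalisation: copies/mL * 1000 mL/L * L/d / persons = copies/cap/d."""
    return [c * 1000.0 * q / population
            for c, q in zip(conc_copies_per_ml, flow_l_per_day)]


def moving_average(series, window):
    """Trailing moving average; the first points average over what is available."""
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def best_lag(signal, cases, max_lag):
    """Lag (in time steps) at which signal[t] correlates best with cases[t + lag]."""
    scores = {}
    for lag in range(max_lag + 1):
        x = signal[: len(signal) - lag]  # signal values that have a partner
        y = cases[lag:]                  # cases observed `lag` steps later
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get), scores


def lagged_features(signal, lags):
    """One row per time step t >= max(lags): [signal[t - lag] for lag in lags]."""
    start = max(lags)
    return [[signal[t - lag] for lag in lags]
            for t in range(start, len(signal))]


if __name__ == "__main__":
    # Synthetic data: cases follow the wastewater signal with a 3-day delay.
    signal = [float((t * 7) % 11) for t in range(40)]
    cases = [0.0] * 3 + [2.0 * s for s in signal[:-3]]
    lag, _ = best_lag(signal, cases, max_lag=7)
    print("estimated lag:", lag)
    print("first feature row:", lagged_features(signal, [2, 3, 4])[0])
```

The rows returned by `lagged_features` could serve as inputs to any of the regression families named in the figure captions (KNN, polynomial, SVR, MLP), with the target series aligned to start at the same time step.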

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure A1
Model performance as spider chart for WWTP A: (a) RMSE and (b) RSQ metric, comparing signal (S) only with the combined information of signal (S), normalisation (N) and tests taken (T).
Figure 1
Modelling workflow.
Figure 2
Raw data timeline of SARS-CoV-2 gene copy numbers (copies/mL) and epidemiological timelines at the four sampling sites, where (A–D) corresponds to WWTPs A–D.
Figure 3
Cross-correlation (CC) plots between SARS-CoV-2 signals and active cases in WWTPs (A–D). The negative and positive lags correspond to the forward and backward lags between the incidence time series and the viral load time series. The solid dashed lines indicate the significance levels at 95% confidence.
Figure 4
Correlation matrix for model regressors. Solid circles indicate Pearson’s correlation coefficients for multivariate models, and the dashed circle shows the same metric for a univariate model. Notation: A, active cases; X, SARS-CoV-2 load; N, normalised signal; S, smoothed signal; T, number of tests taken; and (number), a signal delayed by that number of time steps.
Figure 5
Evaluation of regression models, (a) KNN, (b) polynomial, (c) SVR and (d) MLP, for different ranges of parameters using the mean and variance of RMSE for the test data, WWTP A.
Figure 6
Visualisation of the number of active cases recorded versus model prediction in WWTP A: (A,B) model predictions against recorded data for the training subset under univariate and multivariate inputs, respectively, and (C,D) the same plots for the testing subset.
Figure 7
Taylor diagram displaying a statistical comparison of the eight model predictions against the actual number of recorded active cases, WWTP A.
Figure 8
Data timeline of smoothed and normalised SARS-CoV-2 titer values (Megacopies/cap/d) versus active cases in observed/predicted timelines for the best models in WWTPs (A–D).
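
A Taylor diagram such as the one in Figure 7 places each model according to three statistics: the standard deviation of its predictions, its correlation with the observations, and the centred root-mean-square error. The sketch below computes these for one prediction series. It is an illustration only: the function name `taylor_stats` and the sample numbers are hypothetical, and it does not reproduce the paper's plotting code or results.

```python
from math import sqrt


def taylor_stats(pred, obs):
    """Statistics that locate one model on a Taylor diagram.

    Uses population (divide-by-n) standard deviations so that the identity
    crmse^2 = std_pred^2 + std_obs^2 - 2 * std_pred * std_obs * corr holds.
    """
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    std_p = sqrt(sum((p - mean_p) ** 2 for p in pred) / n)
    std_o = sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs)) / n
    corr = cov / (std_p * std_o)
    # Centred RMSE: the error after removing each series' own mean.
    crmse = sqrt(sum(((p - mean_p) - (o - mean_o)) ** 2
                     for p, o in zip(pred, obs)) / n)
    return {"std_pred": std_p, "std_obs": std_o, "corr": corr, "crmse": crmse}


if __name__ == "__main__":
    # Hypothetical observed and predicted active-case counts.
    observed = [10.0, 20.0, 30.0, 40.0, 50.0]
    predicted = [15.0, 18.0, 32.0, 41.0, 46.0]
    for name, value in taylor_stats(predicted, observed).items():
        print(f"{name}: {value:.3f}")
```

Because the three statistics are tied together by a law-of-cosines relation, a single point in the diagram encodes all of them, which is what allows several models to be compared in one plot.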
