Water quality variables and spectral indices as predictors of E. coli concentrations in an irrigation pond: A case study
- PMID: 40848368
- DOI: 10.1016/j.watres.2025.124344
Abstract
Escherichia coli (E. coli) is a commonly used indicator of microbial water quality, which affects public health and the sustainability of farm enterprises. Remote sensing has become an effective tool for overcoming the limitations of traditional water quality monitoring. In this study, we applied the random forest (RF) machine learning algorithm to estimate E. coli concentrations in irrigation pond water during the summer season using a) 17 water quality variables, b) reflectance in five spectral bands, and c) 24 spectral indices derived from these reflectance values. Linear transform-based postprocessing was found to be beneficial. The RF model with water quality variables as inputs performed well, with an R² of 0.736 and an RMSE of 0.384 log(MPN/100 mL). While the accuracy of the RF model with the five reflectance values as inputs was moderate (R² = 0.562), the RF model using spectral indices had the highest testing R² (0.762) and the lowest RMSE (0.380 log(MPN/100 mL)). After training the RF models for each input dataset, we calculated variable importance using the out-of-bag (OOB) and Shapley additive explanations (SHAP) methods. Dissolved oxygen, chlorophyll-a, pH, and fluorescent dissolved organic matter were the most important predictors when modeling E. coli concentrations from the water quality variables. When spectral indices were used, the most important predictors were the visible atmospherically resistant index (VARI) and the normalized difference turbidity index (NDTI). Comparisons of variable importance between sampling locations revealed that the influence of VARI and NDTI on E. coli concentrations differed in both magnitude and trend between interior and nearshore samples. We hypothesized that the good predictive power of spectral indices can be explained by their ability to characterize the aspects of water quality that are important for E. coli survival. The results of this work demonstrate the feasibility and advantages of applying spectral indices derived from UAV-based multispectral imagery to estimate E. coli concentrations in irrigation ponds.
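The abstract names VARI and NDTI as the most important spectral predictors but does not give their formulas. As a minimal sketch, assuming the commonly cited definitions VARI = (G - R) / (G + R - B) and NDTI = (R - G) / (R + G), the two indices could be computed from per-sample band reflectances as shown here. The function names, argument names, and example reflectance values are hypothetical, and the exact formulations used in the study may differ.

```python
import math


def vari(blue: float, green: float, red: float) -> float:
    """Visible atmospherically resistant index, assumed as (G - R) / (G + R - B)."""
    denom = green + red - blue
    # Guard against division by zero; NaN marks an undefined index value.
    if denom == 0:
        return math.nan
    return (green - red) / denom


def ndti(green: float, red: float) -> float:
    """Normalized difference turbidity index, assumed as (R - G) / (R + G)."""
    denom = red + green
    if denom == 0:
        return math.nan
    return (red - green) / denom


if __name__ == "__main__":
    # Hypothetical surface reflectances (unitless, 0-1) for one sampling location.
    sample = {"blue": 0.04, "green": 0.08, "red": 0.06}
    print("VARI:", vari(sample["blue"], sample["green"], sample["red"]))
    print("NDTI:", ndti(sample["green"], sample["red"]))
```

Both indices are ratios of band differences to band sums, so they are unitless; under the assumed definitions, NDTI is bounded between -1 and 1 for non-negative reflectances, while VARI can take larger magnitudes when the blue reflectance is close to the sum of green and red.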
Keywords: Escherichia coli; Machine learning; Multispectral indices; Remote sensing; Variable importance.
Copyright © 2025. Published by Elsevier Ltd.
Conflict of interest statement
Declaration of competing interest: The authors declare no conflicts of interest.
