Environ Res. 2026 Jan 1;288(Pt 1):123229.
doi: 10.1016/j.envres.2025.123229. Epub 2025 Oct 27.

Ensemble learning-driven optimization of coagulant dosing for drinking water treatment plants using a scalable framework for smart and sustainable process control

Free article

Abderzak Moussouni et al. Environ Res.

Abstract

Coagulation and flocculation remain foundational in drinking water treatment plants (DWTPs) worldwide. Determining the optimal coagulant dosage is a persistent challenge because raw water quality varies and because plants rely on conventional jar tests, which are labor-intensive, time-consuming, and prone to operational inefficiencies. This study introduces a machine learning (ML) framework that uses tree-based ensemble models to predict coagulant dosing with high precision, offering an alternative to empirical approaches. Seven algorithms were systematically evaluated on real-world operational data: four standalone models, Random Forest (RF), ExtraTree, REPTree, and M5P (M5 Prime) Tree, and three RF-based combinations, RF-Extra Tree, RF-REPTree, and RF-M5P Tree. The inputs were five key water quality parameters: turbidity (Tb), electrical conductivity (EC), pH, temperature, and dissolved oxygen (DO). Model performance was assessed using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Nash-Sutcliffe Efficiency (NSE), Kling-Gupta Efficiency (KGE), Willmott's index of agreement (WI), and the coefficient of determination (R²). The RF-Extra Tree model achieved the best predictive performance (RMSE = 0.515, MAE = 0.329, NSE = 0.9850, WI = 0.996, KGE = 0.969, and R² = 0.985), significantly outperforming traditional heuristics and the other models. The RF algorithm also produced robust results (RMSE = 0.807, MAE = 0.586, NSE = 0.963, WI = 0.99, KGE = 0.899, and R² = 0.963), highlighting its potential for deployment in dynamic, data-driven treatment environments. This research underscores the capacity of ensemble learning to model the complex, non-linear relationships inherent in water treatment, and it provides a scalable decision-support tool capable of enhancing treatment consistency, reducing chemical use, and optimizing operational efficiency across diverse geographic and climatic conditions. The proposed methodology holds global relevance for advancing smart water treatment infrastructure, offering both environmental and economic benefits to utilities and stakeholders.
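
The abstract names six evaluation metrics without stating their formulas. The sketch below shows how they are commonly computed from observed and predicted dosages, assuming the standard textbook definitions: KGE in its original 2009 form (correlation, variability ratio, bias ratio), WI in Willmott's original form, and R² as the squared Pearson correlation. The paper may define some of these differently. The function name `evaluate` and the sample dosage values are hypothetical and are not taken from the study.

```python
import math


def evaluate(observed, predicted):
    """Return RMSE, MAE, NSE, KGE, WI and R2 for two equal-length sequences.

    Assumes the standard definitions; the paper's exact formulas are not
    given in the abstract.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    pairs = list(zip(observed, predicted))

    sse = sum((o - p) ** 2 for o, p in pairs)           # squared errors
    sst = sum((o - mean_o) ** 2 for o in observed)      # spread of observations
    ssp = sum((p - mean_p) ** 2 for p in predicted)     # spread of predictions
    cov = sum((o - mean_o) * (p - mean_p) for o, p in pairs)

    r = cov / math.sqrt(sst * ssp)   # Pearson correlation
    alpha = math.sqrt(ssp / sst)     # variability ratio (std_p / std_o)
    beta = mean_p / mean_o           # bias ratio (mean_p / mean_o)

    # Willmott's "potential error" denominator
    wi_den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in pairs)

    return {
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(o - p) for o, p in pairs) / n,
        "NSE": 1.0 - sse / sst,
        "KGE": 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
        "WI": 1.0 - sse / wi_den,
        "R2": r ** 2,
    }


if __name__ == "__main__":
    # Hypothetical coagulant dosages, for illustration only.
    measured = [12.0, 15.5, 18.0, 22.5, 30.0]
    modelled = [12.4, 15.0, 18.6, 21.9, 29.1]
    for name, value in evaluate(measured, modelled).items():
        print(f"{name}: {value:.3f}")
```

RMSE and MAE are in the units of the dosage and are better when lower, while NSE, KGE, WI, and R² are dimensionless and equal 1 for a perfect prediction.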

Keywords: Coagulant dosage; Drinking water treatment plants (DWTPs); Ensemble models; Machine learning (ML); Water quality.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
