Machine learning-based identification of wastewater treatment plant-specific microbial indicators using 16S rRNA gene sequencing
- PMID: 40610607
- PMCID: PMC12226725
- DOI: 10.1038/s41598-025-07952-0
Abstract
Effluent released from municipal wastewater treatment plants reflects the microbial communities responsible for degrading and removing contaminants within the plants. Monitoring this effluent offers essential insights into its environmental impacts, the efficiency of treatment processes, and the presence of emerging contaminants. To support improved monitoring and source attribution, our study employed a machine-learning framework to identify microbial indicators capable of distinguishing between municipal treatment plants based on effluent microbiota. We collected 57 effluent samples for sequencing of the V4 region of the 16S rRNA gene from six treatment plants in the Pirkanmaa region in Finland between 2016 and 2018. Characterising the microbiome revealed that although each plant had unique microbial profiles, their overall diversity and richness were similar. This provided a robust foundation for identifying plant-specific microbes. Using ANOVA-F for feature selection, we focused on the genus level due to its informative prevalence. Among various models tested, the Gaussian Naive Bayes model yielded the highest accuracy with the fewest relevant microbes. We identified nine bacterial genera and one archaeon, whose relative abundances predicted the origin of the effluent with 92% accuracy. Our study outlines a framework for the cost-effective and rapid identification of the origin of effluent or changes in the treatment process, demonstrating the power of machine learning in environmental monitoring and management.
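The two-step workflow named in the abstract (ANOVA-F feature ranking, then a Gaussian Naive Bayes classifier on the top-ranked features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic stand-ins rather than the study's genus-level relative abundances, the helper names (`anova_f`, `select_top_k`, `TinyGaussianNB`, `make_samples`) are hypothetical, and the pure-Python implementation is a simplification, not the authors' pipeline or any library's exact algorithm.

```python
import math
import random
from statistics import fmean, pvariance


def anova_f(values, labels):
    """One-way ANOVA F statistic for a single feature across label groups."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand = fmean(values)
    ss_between = ss_within = 0.0
    for g in groups.values():
        m = fmean(g)
        ss_between += len(g) * (m - grand) ** 2
        ss_within += sum((v - m) ** 2 for v in g)
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(X, y, k):
    """Return the indices of the k features with the largest F statistic."""
    scores = [anova_f([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]


class TinyGaussianNB:
    """Simplified Gaussian Naive Bayes: one normal distribution per class and feature."""

    def fit(self, X, y, eps=1e-9):
        self.stats = {}
        for label in set(y):
            rows = [r for r, lab in zip(X, y) if lab == label]
            cols = list(zip(*rows))
            self.stats[label] = (
                math.log(len(rows) / len(X)),        # log prior
                [fmean(c) for c in cols],            # per-feature means
                [pvariance(c) + eps for c in cols],  # per-feature variances
            )
        return self

    def _log_posterior(self, label, row):
        prior, means, variances = self.stats[label]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
            for x, mu, var in zip(row, means, variances)
        )

    def predict(self, X):
        return [max(self.stats, key=lambda lab: self._log_posterior(lab, row))
                for row in X]


# Synthetic data (assumption): 6 "plants", 30 features, only the first 3
# carry a plant-specific signal; the rest are pure noise.
rng = random.Random(0)
N_PLANTS, N_FEATURES, INFORMATIVE = 6, 30, 3


def make_samples(per_plant):
    X, y = [], []
    for plant in range(N_PLANTS):
        for _ in range(per_plant):
            row = [rng.gauss((plant + j) % N_PLANTS, 0.2) for j in range(INFORMATIVE)]
            row += [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES - INFORMATIVE)]
            X.append(row)
            y.append(plant)
    return X, y


X_train, y_train = make_samples(10)
X_test, y_test = make_samples(5)

# Rank features on the training split only, so the held-out split stays unseen.
selected = select_top_k(X_train, y_train, INFORMATIVE)


def project(X):
    return [[row[j] for j in selected] for row in X]


model = TinyGaussianNB().fit(project(X_train), y_train)
predictions = model.predict(project(X_test))
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print("selected feature indices:", sorted(selected))
print("held-out accuracy:", accuracy)
```

Ranking features on the training split alone keeps the held-out accuracy from being inflated by selection leakage. In practice, a library such as scikit-learn provides equivalents of both steps; this sketch only shows the underlying arithmetic.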
© 2025. The Author(s).
Conflict of interest statement
Declarations. Competing interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Kirsi-Maarit Lehto, Heikki Hyöty and Sami Oikarinen are the stakeholders of GreenSeq Ltd. Finland.
Grants and funding
- Tampere3 Innovation Competition (2017 - 2018)/City of Tampere, Finland