Interpretable machine learning analysis of environmental characteristics on bacillary dysentery in Sichuan Province
- PMID: 40740360
- PMCID: PMC12307499
- DOI: 10.3389/fpubh.2025.1598247
Abstract
Background: Bacterial dysentery (BD) is a leading cause of diarrhea-related mortality globally, and its incidence is heavily influenced by environmental factors. However, Sichuan Province currently lacks a climate zone-specific predictive model for BD.
Objective: This study aims to employ interpretable machine learning to explore the influence of environmental factors on BD incidence across different climate zones and to elucidate their interaction mechanisms.
Methods: Monthly data on meteorological and ecological factors, along with BD case reports, were collected from 183 counties in Sichuan Province (2005-2023). The eXtreme Gradient Boosting (XGBoost) algorithm was employed to assess the influence of key environmental features, including precipitation, temperature, PM10, potential evaporation, vegetation cover, and the normalized difference vegetation index (NDVI), on BD incidence. To enhance interpretability, the model's outputs were visualized and explained using SHapley Additive Explanations (SHAP).
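To illustrate the idea behind SHAP attributions, the following minimal sketch computes exact Shapley values for a toy prediction function. It is not the authors' pipeline: the function `toy_model`, its coefficients, the feature values, and the single-point `baseline` are all hypothetical and chosen only for illustration. The SHAP library itself uses efficient tree-specific algorithms and a background dataset instead of this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in the subset take their observed value;
        # all others stay at the baseline value.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(f):
    # Hypothetical response surface, not fitted to any data
    return (0.5 * f["temperature"]
            + 0.02 * f["pm10"]
            + 0.01 * f["temperature"] * f["precipitation"])


# Hypothetical observation and reference point
x = {"temperature": 25.0, "pm10": 80.0, "precipitation": 120.0}
baseline = {"temperature": 15.0, "pm10": 50.0, "precipitation": 100.0}

phi = shapley_values(toy_model, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the prediction at `x` and the prediction at `baseline`, which is the additivity property that gives SHAP its name. In this toy model PM10 enters without interaction, so its attribution equals its own linear effect, while the temperature-precipitation interaction term is shared between those two features.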
Results: A machine learning model was developed to assess the impact of environmental factors on BD incidence across different climate zones. The findings revealed significant spatial heterogeneity in key drivers of BD. In the Central Subtropical Humid Climate Zone, BD incidence was predominantly influenced by average temperature, PM10, and minimum temperature. In the Subtropical Semi-Humid Climate Zone, potential evaporation, PM10, and precipitation emerged as the primary determinants. In the Plateau Cold Climate Zone, PM10, minimum temperature, and precipitation were the most significant factors. Notably, PM10 consistently showed a positive correlation with BD across all climate zones. Furthermore, average temperature showed a positive association with BD in the Central Subtropical Humid Climate Zone, while potential evaporation and minimum temperature demonstrated similar positive relationships in the Subtropical Semi-Humid and Plateau Cold Climate Zones, respectively. Additionally, precipitation displayed a U-shaped relationship with BD risk in both the Subtropical Semi-Humid and Plateau Cold Climate Zones.
Conclusion: This study developed a climate zone-specific predictive model for BD, systematically evaluating the interactions between environmental factors and BD dynamics. The findings provide a scientific basis for refining targeted public health intervention strategies.
Keywords: SHAP; XGBoost; bacterial dysentery; climate zones; environmental characteristics.
Copyright © 2025 Zhang, Wang, Peng, Zhang, Qin, Zhang, Wei and Kang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
