Front Public Health. 2025 Jul 16:13:1598247.
doi: 10.3389/fpubh.2025.1598247. eCollection 2025.

Interpretable machine learning analysis of environmental characteristics on bacillary dysentery in Sichuan Province

Yao Zhang et al. Front Public Health.

Abstract

Background: Bacterial dysentery (BD) is a leading cause of diarrhea-related mortality globally, and its incidence is heavily influenced by environmental factors. However, Sichuan Province currently lacks a climate zone-specific predictive model for BD.

Objective: This study aims to employ interpretable machine learning to explore the influence of environmental factors on BD incidence across different climate zones and to elucidate their interaction mechanisms.

Methods: Monthly data on meteorological and ecological factors, along with BD case reports, were collected from 183 counties in Sichuan Province (2005-2023). The eXtreme Gradient Boosting (XGBoost) algorithm was employed to assess the influence of key environmental features, including precipitation, temperature, PM10, potential evaporation, vegetation cover, and NDVI, on BD incidence. To enhance interpretability, the model's outputs were visualized and explained using SHapley Additive Explanations (SHAP).
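SHAP attributes a model's prediction to its input features using Shapley values. The following is a minimal sketch of that idea, not the study's pipeline: it computes exact Shapley values by enumerating feature coalitions for a hypothetical three-feature toy function. The function `toy_model`, its coefficients, and the input and baseline values are all assumptions made for illustration; the study itself used a fitted XGBoost model, for which the SHAP library provides far more efficient tree-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    A feature that is absent from a coalition takes its baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for very small examples.
    """
    n = len(x)
    features = list(range(n))
    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in features]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in features]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical toy function standing in for a fitted model (an assumption).
def toy_model(features):
    temperature, pm10, precipitation = features
    return 0.5 * temperature + 0.2 * pm10 + 0.01 * (precipitation - 100.0) ** 2


x = [25.0, 80.0, 160.0]          # made-up instance to explain
baseline = [15.0, 50.0, 100.0]   # made-up reference point
phi = shapley_values(toy_model, x, baseline)

# Additivity: the contributions sum to the prediction minus the baseline prediction.
assert abs(sum(phi) - (toy_model(x) - toy_model(baseline))) < 1e-9
```

The final assertion checks the additivity property that gives SHAP its name: the per-feature contributions sum to the difference between the explained prediction and the baseline prediction.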

Results: A machine learning model was developed to assess the impact of environmental factors on BD incidence across different climate zones. The findings revealed significant spatial heterogeneity in key drivers of BD. In the Central Subtropical Humid Climate Zone, BD incidence was predominantly influenced by average temperature, PM10, and minimum temperature. In the Subtropical Semi-Humid Climate Zone, potential evaporation, PM10, and precipitation emerged as the primary determinants. In the Plateau Cold Climate Zone, PM10, minimum temperature, and precipitation were the most significant factors. Notably, PM10 consistently showed a positive correlation with BD across all climate zones. Furthermore, average temperature showed a positive association with BD in the Central Subtropical Humid Climate Zone, while potential evaporation and minimum temperature demonstrated similar positive relationships in the Subtropical Semi-Humid and Plateau Cold Climate Zones, respectively. Additionally, precipitation displayed a U-shaped relationship with BD risk in both the Subtropical Semi-Humid and Plateau Cold Climate Zones.
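One common way to identify the dominant features in each zone from SHAP output is to rank features by their mean absolute SHAP value within that zone. The sketch below illustrates this under stated assumptions: the abstract does not say which ranking rule the authors used, the helper `rank_features_by_zone` is hypothetical, and the zone labels and SHAP values in `rows` are invented placeholders rather than results from the study.

```python
from collections import defaultdict


def rank_features_by_zone(rows):
    """Rank features per zone by mean absolute SHAP value.

    rows: iterable of (zone, {feature_name: shap_value}) pairs, one per
    observation. Every row in a zone is assumed to carry the same features.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for zone, shap_row in rows:
        counts[zone] += 1
        for feature, value in shap_row.items():
            totals[zone][feature] += abs(value)

    ranking = {}
    for zone, feature_totals in totals.items():
        means = {f: t / counts[zone] for f, t in feature_totals.items()}
        # Largest mean |SHAP| first
        ranking[zone] = sorted(means, key=means.get, reverse=True)
    return ranking


# Invented placeholder values, not data from the study.
rows = [
    ("zone_a", {"temperature": 0.4, "pm10": -0.1, "precipitation": 0.05}),
    ("zone_a", {"temperature": -0.6, "pm10": 0.3, "precipitation": 0.0}),
    ("zone_b", {"temperature": 0.1, "pm10": 0.5, "precipitation": -0.2}),
]
ranking = rank_features_by_zone(rows)
```

Mean absolute SHAP measures how strongly a feature moves predictions, not in which direction. Directional or U-shaped patterns such as those reported for precipitation are read from SHAP dependence plots instead.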

Conclusion: This study developed a climate zone-specific predictive model for BD, systematically evaluating the interactions between environmental factors and BD dynamics. The findings provide a scientific basis for refining targeted public health intervention strategies.

Keywords: SHAP; XGBoost; bacterial dysentery; climate zones; environmental characteristics.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Spatial distribution of climate zones and 183 county-level administrative divisions.
Figure 2. Time series distribution of bacillary dysentery (BD) cases and monthly case distribution by region: Overall (province-wide), Zone 1 (Central Subtropical Humid Climate Zone), Zone 2 (Subtropical Semi-Humid Climate Zone), and Zone 3 (Plateau Cold Climate Zone).
Figure 3. Distribution of SHAP values for environmental factors across regions: (A) Overall, (B) Zone 1, (C) Zone 2, and (D) Zone 3.
Figure 4. SHAP dependence plots for environmental features in the XGBoost model: Province-wide analysis.
Figure 5. SHAP dependence plots for environmental features in the XGBoost model: Central Subtropical Humid Climate Zone analysis.
Figure 6. SHAP dependence plots for environmental features in the XGBoost model: Subtropical Semi-Humid Climate Zone analysis.
Figure 7. SHAP dependence plots for environmental features in the XGBoost model: Plateau Cold Climate Zone analysis.
