An ethnic-sensitive hybrid framework for T2D prediction with explainable AI and weighted ensembles
- PMID: 41436827
- PMCID: PMC12800193
- DOI: 10.1038/s41598-025-31234-4
Abstract
Type 2 diabetes (T2D) is a growing global health crisis, affecting over 537 million people as of 2021. Early prediction remains particularly challenging in low- and middle-income countries due to missing data, class imbalance, and population-specific risk factors. This study presents a four-stage predictive framework, Feature-Weighted Class-Adaptive Generative Imputation Network-Weighted Classifier Aggregation Ensemble (FW-CAGIN-WCAE), designed to address these limitations. First, Zero-Threshold Feature Removal (ZTFR) is applied to eliminate low-quality variables. Second, missing values are imputed using FW-CAGIN, a novel class-aware and feature-weighted GAN model that accounts for both class and feature importance. Third, a performance-weighted ensemble of 15 machine learning and deep learning algorithms is constructed. Finally, SHAP analysis is used to uncover population-specific risk indicators. The proposed method was evaluated on three benchmark datasets (PIDD, FHGDD, and BDD) and their combinations, using nested five-fold cross-validation. The model achieved a peak AUC of 0.936 ± 0.018 on the PIDD-BDD combination and reduced the imputation mean absolute error (MAE) from 0.8028 to 0.0033. It also lowered AUC variability by 36.3% and improved the diagnostic odds ratio (DOR) to 68.4 ± 20.5. SHAP analysis identified as a key predictive feature across both Asian and European populations. These findings demonstrate that the proposed framework offers an accurate, interpretable, and population-sensitive solution for early T2D detection, especially in resource-limited healthcare settings.
Keywords: Ensemble learning; FW-CAGIN imputation; Feature selection; Population-specific risk; SHAP analysis; Type 2 diabetes prediction.
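The abstract does not specify how the performance-weighted ensemble combines its base models or how the DOR was computed. The sketch below is only an illustration under stated assumptions: weights are taken to be proportional to each model's validation AUC (one common choice, not necessarily the authors'), and the DOR uses the standard definition TP·TN / (FP·FN) with a 0.5 continuity correction when any cell is zero. All function names are hypothetical.

```python
from typing import List, Sequence


def performance_weights(val_scores: Sequence[float]) -> List[float]:
    """Normalize per-model validation scores (assumed here to be AUCs)
    so that they sum to 1. This weighting rule is an assumption."""
    total = sum(val_scores)
    if total <= 0:
        raise ValueError("validation scores must have a positive sum")
    return [s / total for s in val_scores]


def weighted_ensemble(probas: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted average of predicted probabilities.
    probas[m][i] is model m's positive-class probability for sample i."""
    n_samples = len(probas[0])
    return [
        sum(w * model_p[i] for w, model_p in zip(weights, probas))
        for i in range(n_samples)
    ]


def diagnostic_odds_ratio(tp: float, fp: float, fn: float, tn: float) -> float:
    """DOR = (TP * TN) / (FP * FN).
    Adds 0.5 to every cell if any cell is zero, to avoid division by zero."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    return (tp * tn) / (fp * fn)


# Illustrative values only, not data from the study.
weights = performance_weights([0.9, 0.6])
ensemble_probs = weighted_ensemble([[1.0, 0.0], [0.0, 1.0]], weights)
dor = diagnostic_odds_ratio(tp=40, fp=10, fn=10, tn=40)
```

With weights of this kind, a model that scored better on validation data contributes more to the final probability, which is the general idea behind a performance-weighted ensemble; the framework's actual aggregation rule may differ.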
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.