Diabetes prediction using machine learning and explainable AI techniques
- PMID: 37077883
- PMCID: PMC10107388
- DOI: 10.1049/htl2.12039
Abstract
Globally, diabetes affects 537 million people, making it the deadliest and the most common non-communicable disease. Many factors can contribute to diabetes, such as excess body weight, abnormal cholesterol levels, family history, physical inactivity and poor dietary habits. Increased urination is one of the most common symptoms of the disease. People who have had diabetes for a long time can develop complications such as heart disorders, kidney disease, nerve damage and diabetic retinopathy. However, the risk can be reduced if the disease is predicted early. In this paper, an automatic diabetes prediction system has been developed using a private dataset of female patients in Bangladesh and various machine learning techniques. The authors used the Pima Indian diabetes dataset and collected additional samples from 203 individuals from a local textile factory in Bangladesh. The mutual information feature selection algorithm has been applied in this work. A semi-supervised model with extreme gradient boosting has been utilized to predict the insulin features of the private dataset. SMOTE and ADASYN approaches have been employed to manage the class imbalance problem. The authors used machine learning classification methods, namely decision tree, SVM, random forest, logistic regression, KNN and various ensemble techniques, to determine which algorithm produces the best prediction results. After training and testing all the classification models, the proposed system achieved its best result with the XGBoost classifier combined with the ADASYN approach: 81% accuracy, an F1 coefficient of 0.81 and an AUC of 0.84. Furthermore, a domain adaptation method has been implemented to demonstrate the versatility of the proposed system. An explainable AI approach with the LIME and SHAP frameworks is implemented to understand how the model predicts the final results.
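As an illustration of the class-imbalance step, the following is a minimal sketch of SMOTE-style oversampling in plain Python. It is not the authors' code: the function name `smote_oversample`, the toy (glucose, BMI) values and the choice of `k` are assumptions made for this example only. ADASYN follows the same interpolation idea but generates more synthetic points around minority samples that have many majority-class neighbours; that weighting is not shown here.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (glucose, BMI) pairs, not taken from the paper.
majority = [(85.0, 22.1), (90.0, 24.3), (99.0, 23.5),
            (92.0, 26.0), (88.0, 21.7), (95.0, 25.2)]
minority = [(150.0, 33.0), (165.0, 35.5), (172.0, 31.8)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region already occupied by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test set keeps its original class distribution.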
Finally, a website framework and an Android smartphone application have been developed that accept various input features and predict diabetes instantaneously. The private dataset of female Bangladeshi patients and the source code are available at the following link: https://github.com/tansin-nabil/Diabetes-Prediction-Using-Machine-Learning.
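The abstract summarises performance with accuracy, F1 and AUC. The sketch below shows how these three metrics are commonly computed for a binary classifier, using only the Python standard library. The labels, scores and the 0.5 threshold are hypothetical toy values, not results from the paper, and the abstract does not state which F1 averaging the authors used; the binary (positive-class) F1 is assumed here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a randomly chosen positive is scored above a
    randomly chosen negative (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = diabetic, 0 = non-diabetic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))   # 0.8125
```

Accuracy and F1 depend on the chosen decision threshold, whereas AUC is computed from the ranking of the scores alone, which is why the two kinds of figure can differ for the same model.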
Keywords: AdaBoost; K‐nearest neighbour; Android application; decision tree; diabetes; random forest; support vector machine.
© 2022 The Authors. Healthcare Technology Letters published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.
Conflict of interest statement
The authors declare no conflict of interest.