Machine Learning-Based Diagnosis and Ranking of Risk Factors for Diabetic Retinopathy in Population-Based Studies from South India
- PMID: 37370980
- PMCID: PMC10297706
- DOI: 10.3390/diagnostics13122084
Abstract
This paper discusses the importance of investigating diabetic retinopathy (DR) using machine learning and presents a computational method for ranking DR risk factors by importance across different machine learning models. The dataset was collected from four large population-based studies conducted in India between 2001 and 2010 on the prevalence of DR and its risk factors. The study uses a t-test and Shapley additive explanations (SHAP) to rank the risk factors. It then applies five machine learning models (K-Nearest Neighbor, Decision Tree, Support Vector Machines, Logistic Regression, and Naive Bayes) to the dataset to identify the unimportant risk factors, using the area under the curve (AUC) for DR prediction as the criterion. To determine the overall significance of the risk variables, a weighted average of each classifier's importance is used. The ranking of risk variables is provided to the machine learning models, and the combination of risk factors with the highest AUC is chosen to construct a model for DR prediction. The results show that glycosylated hemoglobin and systolic blood pressure were among the top three risk factors for DR in all five machine learning models when the t-test was used for ranking. Furthermore, systolic blood pressure and history of hypertension were among the top five risk factors for DR in all the machine learning models when SHAP was used for ranking. Finally, when an ensemble of the five machine learning models was employed, independently with both the t-test and SHAP, systolic blood pressure and diabetes mellitus duration were among the top four risk factors for DR. Decision Tree and K-Nearest Neighbor resulted in the highest AUCs of 0.79 (t-test) and 0.77 (SHAP). Moreover, K-Nearest Neighbor predicted DR with 82.6% (t-test) and 78.3% (SHAP) accuracy.
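The two ranking ideas named in the abstract, a t-test ranking of risk factors and a weighted average of per-classifier importance, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the use of Welch's form of the t-statistic, and the model weights are all assumptions made for illustration, since the abstract does not specify them.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (assumed t-test variant)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_by_t_test(rows, labels, names):
    """Rank features by |t| between the DR (label 1) and no-DR (label 0) groups."""
    scores = {}
    for j, name in enumerate(names):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores[name] = abs(welch_t(pos, neg))
    return sorted(names, key=lambda n: scores[n], reverse=True)


def ensemble_importance(per_model_importance, model_weights):
    """Weighted average of per-model importance scores for each feature."""
    total = sum(model_weights.values())
    features = list(next(iter(per_model_importance.values())))
    return {
        f: sum(model_weights[m] * imp[f] for m, imp in per_model_importance.items()) / total
        for f in features
    }


# Hypothetical toy data: columns are HbA1c (%), systolic BP (mmHg), and a noise column.
names = ["hba1c", "sbp", "noise"]
rows = [
    [9.1, 150, 3.0],
    [8.7, 145, 1.0],
    [9.4, 155, 2.0],
    [6.0, 130, 2.0],
    [6.3, 138, 3.0],
    [5.8, 126, 1.0],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = DR, 0 = no DR (made-up labels)

ranking = rank_by_t_test(rows, labels, names)
print("t-test ranking:", ranking)

# Hypothetical per-model importance scores and model weights (not from the paper).
per_model = {
    "knn": {"hba1c": 0.5, "sbp": 0.4, "noise": 0.1},
    "tree": {"hba1c": 0.3, "sbp": 0.6, "noise": 0.1},
}
weights = {"knn": 1.0, "tree": 3.0}

combined = ensemble_importance(per_model, weights)
print("ensemble importance:", combined)
```

In a fuller pipeline, the ranked list would be fed to each classifier in growing prefixes (top 1, top 2, and so on), keeping the prefix with the highest AUC; that selection step is omitted here to keep the sketch short.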
Keywords: diabetic retinopathy; machine learning; ranking; risk factors.
Conflict of interest statement
The authors declare no conflict of interest.