Diagnostics (Basel). 2023 Jun 16;13(12):2084.
doi: 10.3390/diagnostics13122084.

Machine Learning-Based Diagnosis and Ranking of Risk Factors for Diabetic Retinopathy in Population-Based Studies from South India

Abhishek Vyas et al. Diagnostics (Basel).

Abstract

This paper discusses the importance of investigating diabetic retinopathy (DR) using machine learning and presents a computational method that ranks DR risk factors by importance across different machine learning models. The dataset was collected from four large population-based studies conducted in India between 2001 and 2010 on the prevalence of DR and its risk factors. The study uses a t-test and Shapley additive explanations (SHAP) to rank the risk factors (variables). The rankings are then provided to five machine learning models (K-Nearest Neighbor, Decision Tree, Support Vector Machines, Logistic Regression, and Naive Bayes), which are used to identify the unimportant risk factors based on the area under the curve (AUC) criterion for predicting DR. To determine the overall significance of the risk variables, a weighted average of each classifier's importance is used. The combination of risk factors with the highest AUC is chosen to construct a model for DR prediction. The results show that glycosylated hemoglobin and systolic blood pressure were among the top three risk factors for DR in all five machine learning models when the t-test was used for ranking. When SHAP was used for ranking, systolic blood pressure and history of hypertension were among the top five risk factors for DR in all the machine learning models. Finally, when an ensemble of the five machine learning models was employed, independently with both the t-test and SHAP, systolic blood pressure and diabetes mellitus duration were among the top four risk factors for DR. Decision Tree and K-Nearest Neighbor resulted in the highest AUCs of 0.79 (t-test) and 0.77 (SHAP). Moreover, K-Nearest Neighbor predicted DR with 82.6% (t-test) and 78.3% (SHAP) accuracy.
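The ranking and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, Welch's variant of the t-test, a rank-based AUC computed on a single raw feature instead of a trained classifier, and ensemble weights chosen by hand, since the abstract does not state how the weights were derived. All function names, feature names, and data values below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (an assumed t-test variant)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_by_t_test(rows, labels, names):
    """Order feature names by |t| between the DR (1) and non-DR (0) groups, largest first."""
    scores = {}
    for j, name in enumerate(names):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores[name] = abs(welch_t(pos, neg))
    return sorted(names, key=lambda n: scores[n], reverse=True)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_importance(per_model_importance, weights):
    """Weighted average of per-model importance scores for each feature."""
    total = sum(weights.values())
    features = next(iter(per_model_importance.values())).keys()
    return {
        f: sum(weights[m] * imp[f] for m, imp in per_model_importance.items()) / total
        for f in features
    }


# Synthetic toy data (hypothetical values, not taken from the study).
names = ["hba1c", "sbp", "noise"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
rows = [
    (9.1, 150, 3), (8.7, 145, 1), (9.5, 160, 2), (8.9, 138, 4),
    (6.2, 128, 2), (6.8, 135, 3), (6.0, 120, 1), (6.5, 142, 4),
]

ranking = rank_by_t_test(rows, labels, names)
sbp_auc = auc([r[1] for r in rows], labels)

# Hypothetical per-model importances and weights for the ensemble step.
combined = ensemble_importance(
    {"knn": {"hba1c": 0.5, "sbp": 0.3}, "tree": {"hba1c": 0.1, "sbp": 0.7}},
    {"knn": 1.0, "tree": 3.0},
)
print(ranking, sbp_auc, combined)
```

In a fuller pipeline, the ranked features would be fed to each of the five classifiers in turn, and the feature subset giving the highest cross-validated AUC would be retained; the single-feature AUC here only stands in for that evaluation.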

Keywords: diabetic retinopathy; machine learning; ranking; risk factors.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. The flow of the method design.
Figure 2. (a,b,d,e) AUC and accuracy for the Decision Tree and K-Nearest Neighbor models in the case of t-test and SHAP. (c,f) ROC curves for all models in the case of t-test and SHAP.
