2021 Dec 9:8:773881.
doi: 10.3389/fmed.2021.773881. eCollection 2021.

Usefulness of Machine Learning for Identification of Referable Diabetic Retinopathy in a Large-Scale Population-Based Study

Cheng Yang et al. Front Med (Lausanne).

Abstract

Purpose: To develop and validate machine learning-based classifiers that use simple non-ocular metrics to detect referable diabetic retinopathy (RDR) in a large-scale Chinese population-based survey.

Methods: The 1,418 patients with diabetes mellitus identified among 8,952 rural residents screened in the population-based Dongguan Eye Study were used for model development and validation. Eight algorithms [extreme gradient boosting (XGBoost), random forest, naïve Bayes, k-nearest neighbor (KNN), AdaBoost, LightGBM, artificial neural network (ANN), and logistic regression] were used to model the detection of RDR in individuals with diabetes. The area under the receiver operating characteristic curve (AUC) and its 95% confidence interval (95% CI) were estimated using five-fold cross-validation as well as an 80:20 split between training and validation sets.

Results: The 10 most important features in the machine learning models were duration of diabetes, HbA1c, systolic blood pressure, triglyceride, body mass index, serum creatine, age, educational level, duration of hypertension, and income level. Based on these top 10 variables, the XGBoost model achieved the best discriminative performance, with an AUC of 0.816 (95% CI: 0.812, 0.820). The AUCs for logistic regression, AdaBoost, naïve Bayes, and random forest were 0.766 (95% CI: 0.756, 0.776), 0.754 (95% CI: 0.744, 0.764), 0.753 (95% CI: 0.743, 0.763), and 0.705 (95% CI: 0.697, 0.713), respectively.

Conclusions: A machine learning-based classifier that used 10 easily obtained non-ocular variables was able to detect RDR patients effectively. The importance scores of the variables offer insight into preventing the occurrence of RDR. Screening for RDR with machine learning provides a useful complementary tool for clinical practice in resource-poor areas with limited ophthalmic infrastructure.
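The evaluation described in the Methods (AUC estimated over five-fold cross-validation, reported with a 95% CI) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two features are hypothetical, the model is a plain logistic regression rather than XGBoost, and the interval is a normal approximation over the fold AUCs. The abstract does not state how the authors derived their intervals, so this should not be read as their procedure.

```python
import math
import random
import statistics


def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def cross_validated_auc(X, y, k=5, seed=0):
    """Train on k-1 folds, score the held-out fold, and return one AUC per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in held_out]
        aucs.append(auc([y[i] for i in held_out], scores))
    return aucs


def mean_with_ci(values, z=1.96):
    """Mean with a normal-approximation 95% interval (an assumed choice of interval)."""
    m = statistics.mean(values)
    half = z * statistics.stdev(values) / math.sqrt(len(values))
    return m, m - half, m + half


# Synthetic cohort: two hypothetical standardized features, shifted upward for positives.
rng = random.Random(42)
X, y = [], []
for _ in range(400):
    label = 1 if rng.random() < 0.3 else 0
    X.append([rng.gauss(0.8 * label, 1.0), rng.gauss(0.5 * label, 1.0)])
    y.append(label)

fold_aucs = cross_validated_auc(X, y)
mean_auc, ci_low, ci_high = mean_with_ci(fold_aucs)
print(f"AUC {mean_auc:.3f} (95% CI: {ci_low:.3f}, {ci_high:.3f})")
```

With only five fold-level values, an interval based on the t-distribution would be wider than this normal approximation; the sketch only shows the overall shape of the computation.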

Keywords: XGBoost; classifier; diabetic retinopathy; machine learning; population-based study.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Machine learning flowchart of this study. ML, machine learning; XGBoost, extreme gradient boosting; ANN, artificial neural network; AdaBoost, adaptive boosting; GBM, gradient boosting machine.
Figure 2. Feature importance contributed to each machine learning model. (A) XGBoost. (B) Random forest. (C) Naïve Bayes. (D) KNN.
Figure 3. Venn plot showing the most important features in each model for detecting referable diabetic retinopathy.
Figure 4. Receiver operating characteristic curves of five algorithms for detecting referable diabetic retinopathy based on the top 10 important variables.
