Ophthalmol Retina. 2019 Apr;3(4):294-304.
doi: 10.1016/j.oret.2018.10.014. Epub 2018 Nov 3.

Deep Learning-Based Algorithms in Screening of Diabetic Retinopathy: A Systematic Review of Diagnostic Performance

Katrine B Nielsen et al. Ophthalmol Retina. 2019 Apr.

Abstract

Topic: Diagnostic performance of deep learning-based algorithms in screening patients with diabetes for diabetic retinopathy (DR). The algorithms were compared with the current gold standard of classification by human specialists.

Clinical relevance: Because DR is a common cause of visual impairment, screening is indicated to avoid irreversible vision loss. Automated DR classification using deep learning may be a suitable new screening tool that could improve diagnostic performance and reduce manpower.

Methods: For this systematic review, we aimed to identify studies that incorporated the use of deep learning in classifying full-scale DR in retinal fundus images of patients with diabetes. The studies had to provide a DR grading scale, a human grader as a reference standard, and a deep learning performance score. A systematic search on April 5, 2018, through MEDLINE and Embase yielded 304 publications. To identify potentially missed publications, the reference lists of the final included studies were manually screened, yielding no additional publications. The Quality Assessment of Diagnostic Accuracy Studies 2 tool was used for risk of bias and applicability assessment.

Results: By using objective selection, we included 11 diagnostic accuracy studies that validated the performance of their deep learning method using a new group of patients or retrospective datasets. Eight studies reported sensitivity and specificity of 80.28% to 100.0% and 84.0% to 99.0%, respectively. Two studies reported accuracies of 78.7% and 81.0%. One study reported an area under the receiver operating characteristic curve of 0.955. In addition to diagnostic performance, one study also reported on patient satisfaction, showing that 78% of patients preferred an automated deep learning model over manual human grading.
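For readers less familiar with the performance measures reported above, the following is a minimal sketch of how sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve can be computed when an algorithm's output is compared against a human-grader reference standard. The class and function names, the counts, and the scores are hypothetical illustrations and are not taken from any of the included studies; the AUC is computed with the standard pairwise (Mann-Whitney) formulation, which may differ from the method used in a given study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Algorithm output versus the human-grader reference (hypothetical container)."""
    tp: int  # reference: DR present, algorithm: DR present
    fp: int  # reference: DR absent,  algorithm: DR present
    tn: int  # reference: DR absent,  algorithm: DR absent
    fn: int  # reference: DR present, algorithm: DR absent


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of reference-positive cases the algorithm flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of reference-negative cases the algorithm clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases on which algorithm and reference agree."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen reference-positive case
    scores higher than a randomly chosen reference-negative case (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical example data, for illustration only.
counts = ConfusionCounts(tp=90, fp=8, tn=92, fn=10)
pos = [0.9, 0.8, 0.4]  # algorithm scores for reference-positive images
neg = [0.7, 0.3, 0.2]  # algorithm scores for reference-negative images

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(counts):.3f}")
    print(f"specificity = {specificity(counts):.3f}")
    print(f"accuracy    = {accuracy(counts):.3f}")
    print(f"AUC         = {auc(pos, neg):.3f}")
```

Sensitivity and specificity depend on the operating threshold chosen for the algorithm, whereas the AUC summarizes performance across all thresholds, which is one reason the figures reported by different studies are not directly interchangeable.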

Conclusions: Advantages of implementing deep learning-based algorithms in DR screening include reduction in manpower, cost of screening, and issues relating to intragrader and intergrader variability. However, limitations that may hinder such an implementation particularly revolve around ethical concerns regarding lack of trust in the diagnostic accuracy of computers. Considering both strengths and limitations, as well as the high performance of deep learning-based algorithms, automated DR classification using deep learning could be feasible in a real-world screening scenario.
