Diagnostics (Basel). 2024 Jan 5;14(2):121. doi: 10.3390/diagnostics14020121.

Interpretable Detection of Diabetic Retinopathy, Retinal Vein Occlusion, Age-Related Macular Degeneration, and Other Fundus Conditions

Wenlong Li et al. Diagnostics (Basel).

Abstract

Diabetic retinopathy (DR), retinal vein occlusion (RVO), and age-related macular degeneration (AMD) pose significant global health challenges, often resulting in vision impairment and blindness. Automatic detection of these conditions is crucial, particularly in underserved rural areas with limited access to ophthalmic services. Despite remarkable advances in artificial intelligence, especially convolutional neural networks (CNNs), their complexity can make interpretation difficult. In this study, we curated a dataset of 15,089 color fundus photographs (CFPs) from 8110 patients who underwent fundus fluorescein angiography (FFA) examination. The primary objective was to construct integrated models that merge CNNs with an attention mechanism for a hierarchical multilabel classification task: detecting DR, RVO, AMD, and other fundus conditions, and further classifying DR, RVO, and AMD into their respective subclasses. Diagnostic information obtained from the FFA results was translated into labels for the CFPs, and we evaluated the models' ability to reach precise diagnoses from CFPs alone. Our models showed improvements across diverse fundus conditions, with the ConvNeXt-base + attention model performing best. For DR detection, it achieved an area under the receiver operating characteristic curve (AUC) of 0.943, a referable F1 score of 0.870, and a Cohen's kappa of 0.778; for RVO, an AUC of 0.960, a referable F1 score of 0.854, and a Cohen's kappa of 0.819; and for AMD, an AUC of 0.959, an F1 score of 0.727, and a Cohen's kappa of 0.686. The model also subclassified RVO and AMD with good sensitivity and specificity. Moreover, visualizing the attention weights on fundus images enhanced interpretability by aiding the identification of disease findings. These outcomes underscore the contribution of our models to the detection of DR, RVO, and AMD, with the potential to improve patient outcomes.
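The three reported metrics (AUC, F1 score, Cohen's kappa) can each be computed from model outputs with standard formulas. A minimal NumPy sketch for illustration, not the authors' evaluation code:

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the rank-sum (Mann-Whitney U) identity.

    Ties in y_score are broken arbitrarily here; a full implementation
    would assign average ranks to tied scores.
    """
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall = 2*TP / (2*TP + FP + FN)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return 2 * tp / (2 * tp + fp + fn)

def cohens_kappa(y_true, y_pred):
    """Observed agreement corrected for the agreement expected by chance."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    p_obs = np.mean(y_true == y_pred)
    p_exp = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    return (p_obs - p_exp) / (1 - p_exp)
```

Kappa is the most conservative of the three: a classifier that always predicts the majority class can score a high AUC-adjacent accuracy but a kappa near zero, which is why the abstract reports it alongside AUC and F1.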

Keywords: age-related macular degeneration; automated detection; diabetic retinopathy; interpretable; retinal vein occlusion.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
CNN + attention modelling framework. The green dashed box highlights the core of the CNN + attention models, namely, the standard Transformer encoder. Patch features are extracted by the ResNet101, EfficientNetV2-M, or ConvNeXt-base CNN architectures. In pure CNN models, patch features undergo average pooling (AvgPool2d) before reaching multitask classification heads within the blue dashed box. In CNN + attention models, the standard Transformer encoder encodes patch features and forwards CLS token to the final classification heads. The classification heads handle primary classes, observing all training data with multilabel asymmetric loss (ASL) calculation, while the three other heads for subclasses only see data with specified subclass information, calculating individual single-label ASLs. The total loss during training is the sum of multilabel ASL and three single-label ASLs.
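The asymmetric loss (ASL) named in the caption can be sketched as follows. The paper's exact hyperparameters are not given in this excerpt, so the focusing factors and clipping margin below follow the original ASL formulation and are assumptions; the hierarchical total loss is then the multilabel ASL plus the three subclass single-label ASLs, as described above.

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Multilabel asymmetric loss (ASL), averaged over labels.

    Positive terms are weighted by (1-p)**gamma_pos and negative terms by
    p**gamma_neg after shifting p down by `clip`. This suppresses the
    contribution of easy negatives, which dominate when most disease
    labels are absent in any given fundus photograph.
    """
    eps = 1e-8
    p = 1.0 / (1.0 + np.exp(-logits))              # sigmoid per label
    p_neg = np.clip(p - clip, 0.0, 1.0)            # probability margin for negatives
    loss_pos = targets * (1 - p) ** gamma_pos * np.log(np.clip(p, eps, 1.0))
    loss_neg = (1 - targets) * p_neg ** gamma_neg * np.log(np.clip(1 - p_neg, eps, 1.0))
    return -np.mean(loss_pos + loss_neg)
```

Under this sketch, a confidently correct negative (logit -5, target 0) incurs essentially zero loss because the clipped probability falls to 0, while a confidently wrong positive still incurs a large penalty, matching ASL's intended asymmetry between present and absent findings.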
Figure 2
Receiver operating characteristic curves (ROCs) for the primary classes. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 3
Confusion matrices for the primary classes. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 4
ROCs for the DR subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 5
Confusion matrices for the DR subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 6
ROCs for the RVO subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 7
Confusion matrices for the RVO subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 8
ROCs for the AMD subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 9
Confusion matrices for the AMD subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 10
Visualization of the attention weights of the ConvNeXt-base + attention model in CFPs. (a) A case of DR with macular edema (ME), graded severe nonproliferative DR (sNPDR); (b) a case of RVO with laser spots, graded branch RVO (BRVO); and (c) a case of AMD, graded dry AMD.
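The kind of overlay shown in Figure 10 can be produced by projecting the CLS token's attention over the patch grid back onto the image. A minimal sketch under stated assumptions — the grid size, head averaging, and alpha-blending scheme are illustrative choices, not the authors' implementation:

```python
import numpy as np

def attention_overlay(image, cls_attn, alpha=0.5):
    """Blend CLS-token attention weights onto a grayscale fundus image.

    image:    (H, W) array in [0, 1]
    cls_attn: (g*g,) attention from the CLS token to the g x g patch grid
              (e.g. averaged over heads of the last Transformer layer);
              assumes H and W are divisible by g
    Returns a blended (H, W) array in [0, 1].
    """
    g = int(np.sqrt(cls_attn.size))
    grid = cls_attn.reshape(g, g)
    grid = (grid - grid.min()) / (np.ptp(grid) + 1e-8)   # normalize to [0, 1]
    h, w = image.shape
    # nearest-neighbor upsample of the patch grid to image resolution
    heat = np.kron(grid, np.ones((h // g, w // g)))
    return (1 - alpha) * image + alpha * heat
```

A smoother map (as typically shown in papers) would replace the nearest-neighbor `np.kron` upsampling with bilinear interpolation and render the heatmap through a color map before compositing over the RGB photograph.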
