Diagnostics (Basel). 2024 Jan 5;14(2):121. doi: 10.3390/diagnostics14020121.

Interpretable Detection of Diabetic Retinopathy, Retinal Vein Occlusion, Age-Related Macular Degeneration, and Other Fundus Conditions

Wenlong Li et al. Diagnostics (Basel).

Abstract

Diabetic retinopathy (DR), retinal vein occlusion (RVO), and age-related macular degeneration (AMD) pose significant global health challenges, often resulting in vision impairment and blindness. Automatic detection of these conditions is crucial, particularly in underserved rural areas with limited access to ophthalmic services. Despite remarkable advances in artificial intelligence, especially convolutional neural networks (CNNs), their complexity can make interpretation difficult. In this study, we curated a dataset of 15,089 color fundus photographs (CFPs) from 8110 patients who underwent fundus fluorescein angiography (FFA) examination. The primary objective was to construct integrated models that merge CNNs with an attention mechanism for a hierarchical multilabel classification task: detecting DR, RVO, AMD, and other fundus conditions, and further classifying DR, RVO, and AMD into their respective subclasses. Diagnostic information obtained from the FFA results was translated into labels for the CFPs, and we evaluated the models' ability to reach precise diagnoses from CFPs alone. Our models showed improvements across diverse fundus conditions, with the ConvNeXt-base + attention model performing best. For DR detection, it achieved an area under the receiver operating characteristic curve (AUC) of 0.943, a referable F1 score of 0.870, and a Cohen's kappa of 0.778; for RVO, an AUC of 0.960, a referable F1 score of 0.854, and a Cohen's kappa of 0.819; and for AMD, an AUC of 0.959, an F1 score of 0.727, and a Cohen's kappa of 0.686. The model also subclassified RVO and AMD with good sensitivity and specificity. Moreover, visualizing the attention weights on fundus images enhanced interpretability by aiding the identification of disease findings. These outcomes underscore the contribution of our models to the detection of DR, RVO, and AMD, with the potential to improve patient outcomes.
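The three reported metrics (AUC, F1 score, Cohen's kappa) can each be computed from model outputs with standard formulas. A minimal NumPy sketch for illustration, not the authors' evaluation code:

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the rank-sum (Mann-Whitney U) identity.

    Ties in y_score are broken arbitrarily here; a full implementation
    would assign average ranks to tied scores.
    """
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall = 2*TP / (2*TP + FP + FN)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return 2 * tp / (2 * tp + fp + fn)

def cohens_kappa(y_true, y_pred):
    """Observed agreement corrected for the agreement expected by chance."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    p_obs = np.mean(y_true == y_pred)
    p_exp = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    return (p_obs - p_exp) / (1 - p_exp)
```

Kappa is the most conservative of the three: a classifier that always predicts the majority class can score a high AUC-adjacent accuracy but a kappa near zero, which is why the abstract reports it alongside AUC and F1.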

Keywords: age-related macular degeneration; automated detection; diabetic retinopathy; interpretable; retinal vein occlusion.

PubMed Disclaimer

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
CNN + attention modelling framework. The green dashed box highlights the core of the CNN + attention models, namely, the standard Transformer encoder. Patch features are extracted by the ResNet101, EfficientNetV2-M, or ConvNeXt-base CNN architectures. In pure CNN models, patch features undergo average pooling (AvgPool2d) before reaching multitask classification heads within the blue dashed box. In CNN + attention models, the standard Transformer encoder encodes patch features and forwards CLS token to the final classification heads. The classification heads handle primary classes, observing all training data with multilabel asymmetric loss (ASL) calculation, while the three other heads for subclasses only see data with specified subclass information, calculating individual single-label ASLs. The total loss during training is the sum of multilabel ASL and three single-label ASLs.
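The asymmetric loss (ASL) named in the caption can be sketched as follows. The paper's exact hyperparameters are not given in this excerpt, so the focusing factors and clipping margin below follow the original ASL formulation and are assumptions; the hierarchical total loss is then the multilabel ASL plus the three subclass single-label ASLs, as described above.

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Multilabel asymmetric loss (ASL), averaged over labels.

    Positive terms are weighted by (1-p)**gamma_pos and negative terms by
    p**gamma_neg after shifting p down by `clip`. This suppresses the
    contribution of easy negatives, which dominate when most disease
    labels are absent in any given fundus photograph.
    """
    eps = 1e-8
    p = 1.0 / (1.0 + np.exp(-logits))              # sigmoid per label
    p_neg = np.clip(p - clip, 0.0, 1.0)            # probability margin for negatives
    loss_pos = targets * (1 - p) ** gamma_pos * np.log(np.clip(p, eps, 1.0))
    loss_neg = (1 - targets) * p_neg ** gamma_neg * np.log(np.clip(1 - p_neg, eps, 1.0))
    return -np.mean(loss_pos + loss_neg)
```

Under this sketch, a confidently correct negative (logit -5, target 0) incurs essentially zero loss because the clipped probability falls to 0, while a confidently wrong positive still incurs a large penalty, matching ASL's intended asymmetry between present and absent findings.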
Figure 2
Receiver operating characteristic curves (ROCs) for the primary classes. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 3
Confusion matrices for the primary classes. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 4
ROCs for the DR subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 5
Confusion matrices for the DR subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 6
ROCs for the RVO subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 7
Confusion matrices for the RVO subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 8
ROCs for the AMD subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 9
Confusion matrices for the AMD subclasses. (a) ResNet101, (b) EfficientNetV2-M, (c) ConvNeXt-base, (d) ResNet101 + attention, (e) EfficientNetV2-M + attention, and (f) ConvNeXt-base + attention.
Figure 10
Visualization of the attention weights of the ConvNeXt-base + attention model in CFPs. (a) A case of DR with macular edema (ME), graded severe nonproliferative DR (sNPDR); (b) a case of RVO with laser spots, graded branch RVO (BRVO); and (c) a case of AMD, graded dry AMD.
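The kind of overlay shown in Figure 10 can be produced by projecting the CLS token's attention over the patch grid back onto the image. A minimal sketch under stated assumptions — the grid size, head averaging, and alpha-blending scheme are illustrative choices, not the authors' implementation:

```python
import numpy as np

def attention_overlay(image, cls_attn, alpha=0.5):
    """Blend CLS-token attention weights onto a grayscale fundus image.

    image:    (H, W) array in [0, 1]
    cls_attn: (g*g,) attention from the CLS token to the g x g patch grid
              (e.g. averaged over heads of the last Transformer layer);
              assumes H and W are divisible by g
    Returns a blended (H, W) array in [0, 1].
    """
    g = int(np.sqrt(cls_attn.size))
    grid = cls_attn.reshape(g, g)
    grid = (grid - grid.min()) / (np.ptp(grid) + 1e-8)   # normalize to [0, 1]
    h, w = image.shape
    # nearest-neighbor upsample of the patch grid to image resolution
    heat = np.kron(grid, np.ones((h // g, w // g)))
    return (1 - alpha) * image + alpha * heat
```

A smoother map (as typically shown in papers) would replace the nearest-neighbor `np.kron` upsampling with bilinear interpolation and render the heatmap through a color map before compositing over the RGB photograph.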
