Vision transformers: The next frontier for deep learning-based ophthalmic image analysis
- PMID: 38074310
- PMCID: PMC10701151
- DOI: 10.4103/sjopt.sjopt_91_23
Abstract
Deep learning is the state-of-the-art machine learning technique for ophthalmic image analysis, and convolutional neural networks (CNNs) are the most commonly used approach. Recently, vision transformers (ViTs) have emerged as a promising approach, one that is even more powerful than CNNs. In this focused review, we summarized studies that applied ViT-based models to color fundus photographs and optical coherence tomography images. Overall, ViT-based models showed robust performance in diabetic retinopathy grading and glaucoma detection. Although some studies demonstrated that ViTs were superior to CNNs in certain contexts of use, it is unclear how widely ViTs will be adopted for ophthalmic image analysis, since ViTs typically require even more training data than CNNs. The included studies were identified from the PubMed and Google Scholar databases using keywords relevant to this review. Only original investigations published through March 2023 were included.
Keywords: Color fundus photographs; deep learning; ophthalmic image analysis; optical coherence tomography; vision transformers.
Copyright: © 2023 Saudi Journal of Ophthalmology.
Conflict of interest statement
There are no conflicts of interest.