Diagnostics (Basel). 2025 Jul 19;15(14):1823.
doi: 10.3390/diagnostics15141823.

Can YOLO Detect Retinal Pathologies? A Step Towards Automated OCT Analysis


Adriana-Ioana Ardelean et al. Diagnostics (Basel).

Abstract

Background: Optical Coherence Tomography (OCT) has become a common imaging technique that enables non-invasive, detailed visualization of the retina and allows various diseases to be identified. As the technology has advanced, the volume and complexity of OCT data have rendered manual analysis infeasible, creating the need for automated means of detection.

Methods: This study investigates the ability of state-of-the-art object detection models, including the latest YOLO versions (from v8 to v12), YOLO-World, YOLOE, and RT-DETR, to accurately detect pathological biomarkers in two retinal OCT datasets. The AROI dataset focuses on fluid detection in Age-related Macular Degeneration, while the OCT5k dataset covers a wide range of retinal pathologies.

Results: The experiments show that YOLOv12 offers the best balance between detection accuracy and computational efficiency, while YOLOE consistently outperforms all other models across both datasets and most classes, particularly for pathologies that cover a small area.

Conclusions: This work provides a comprehensive benchmark of the capabilities of state-of-the-art object detection for medical applications, specifically for identifying retinal pathologies from OCT scans, offering insights and a starting point for the development of future automated analysis solutions in a clinical setting.

Keywords: YOLO; neural networks; object detection; optical coherence tomography; retinal OCT.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Class distribution of biomarkers in the AROI dataset (class 1 represents PED, class 2 represents SRF, and class 3 represents IRF).
Figure 2
Original image, original segmentation, and the corresponding YOLO labels (class 1/orange represents PED, class 2/green represents SRF, and class 3/red represents IRF). IRF instances tend to cover small regions and may overlap once translated to bounding boxes. Because of the segmentation, the bounding boxes of PED and SRF also overlap with each other, and sometimes with the smaller IRF instances.
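The segmentation-to-YOLO-label conversion described in this caption can be sketched in plain Python. This is an illustrative reconstruction, not the authors' pipeline: the function name, the class IDs, and the one-box-per-class simplification are assumptions.

```python
def mask_to_yolo_boxes(mask, class_ids=(1, 2, 3)):
    """Convert an integer segmentation mask (list of rows) into
    YOLO-format (class, x_center, y_center, width, height) tuples,
    with coordinates normalized to [0, 1].

    For simplicity this emits one box per class; separating multiple
    instances of the same class (e.g. the many small IRF regions)
    would additionally require connected-component labeling.
    """
    h, w = len(mask), len(mask[0])
    boxes = []
    for cls in class_ids:
        pixels = [(y, x) for y in range(h) for x in range(w) if mask[y][x] == cls]
        if not pixels:
            continue  # class absent from this scan
        ys = [y for y, _ in pixels]
        xs = [x for _, x in pixels]
        x_min, x_max = min(xs), max(xs) + 1  # half-open pixel extents
        y_min, y_max = min(ys), max(ys) + 1
        boxes.append((
            cls - 1,                    # YOLO class indices start at 0
            (x_min + x_max) / (2 * w),  # normalized x center
            (y_min + y_max) / (2 * h),  # normalized y center
            (x_max - x_min) / w,        # normalized width
            (y_max - y_min) / h,        # normalized height
        ))
    return boxes

# Tiny synthetic mask: class 1 fills a 2x2 block of a 4x4 scan.
demo = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
print(mask_to_yolo_boxes(demo))  # [(0, 0.5, 0.5, 0.5, 0.5)]
```

Because the boxes are axis-aligned hulls of irregular segmented regions, overlapping boxes (as noted in the caption) fall out naturally from this conversion.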
Figure 3
Example of an image from the OCT5k dataset with the bounding boxes given for each class instance (class 1/orange represents Choroidalfolds, class 2/green represents Fluid, class 3/red represents Geographicatrophy, class 4/purple represents Harddrusen, class 5/brown represents Hyperfluorescentspots, class 6/pink represents PRlayerdisruption, class 7/gray represents Reticulardrusen, class 8/olive represents Softdrusen, class 9/cyan represents SoftdrusenPED).
Figure 4
YOLOv12 training and validation curves on the AROI dataset, showing that the model attains stability during training with minimal overfitting, as indicated by the close alignment of the training and validation curves.
Figure 5
YOLOv12 F1-confidence curve on the AROI dataset showing the severe drop in performance as the confidence threshold increases for the IRF class.
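An F1-confidence curve like this one is built by sweeping a confidence threshold and recomputing F1 at each point. A minimal sketch in plain Python, where the IoU-based matching of detections to ground truth is assumed to have been done upstream and the function name is illustrative:

```python
def f1_confidence_curve(detections, n_gt, thresholds=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Compute F1 at a sweep of confidence thresholds for one class.

    detections: list of (confidence, is_true_positive) pairs, where
    is_true_positive marks detections already matched to ground truth
    (e.g. by an IoU >= 0.5 criterion).
    n_gt: number of ground-truth instances of the class.
    """
    curve = []
    for t in thresholds:
        kept = [tp for conf, tp in detections if conf >= t]
        tp = sum(kept)       # matched detections above threshold
        fp = len(kept) - tp  # unmatched detections above threshold
        fn = n_gt - tp       # ground-truth instances left undetected
        denom = 2 * tp + fp + fn
        curve.append((t, 2 * tp / denom if denom else 0.0))
    return curve

# Toy class: raising the threshold eventually discards true positives,
# so F1 collapses at high confidence.
dets = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for t, f1 in f1_confidence_curve(dets, n_gt=4):
    print(f"conf >= {t}: F1 = {f1:.2f}")
```

A class whose curve drops steeply as the threshold rises, like IRF here, only scores well when low-confidence detections are kept, which is exactly the behavior the figure highlights.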
Figure 6
YOLOv12 validation examples on the AROI dataset, showing that the model generally identifies PED (orange) and SRF (green) correctly, while IRF (red) is harder to distinguish from the background, and its bounding boxes are harder to align because the class occupies only small regions.
Figure 7
YOLOE training and validation curves on the AROI dataset, showing that the model attains stability during training with minimal overfitting, as indicated by the close alignment of the training and validation curves.
Figure 8
YOLOE F1-confidence curve on the AROI dataset showing the severe drop in performance as the confidence threshold increases for the IRF class.
Figure 9
YOLOE validation comparison between ground truth and predictions on the AROI dataset, showing that the model generally identifies PED (orange) and SRF (green) correctly, while IRF (red) is harder to distinguish from the background, and its bounding boxes are harder to align because the class occupies only small regions.
Figure 10
YOLOv12 training and validation curves on the OCT5k dataset, showing that the model attains stability during training with minimal overfitting, as indicated by the close alignment of the training and validation curves.
Figure 11
YOLOv12 F1-confidence curve on the OCT5k dataset, showing the severe drop in performance for the PRlayerdisruption class as the confidence threshold increases, while the F1 scores for Fluid and Hyperfluorescentspots remain at 0.
Figure 12
YOLOv12 validation comparison between ground truth and predictions on the OCT5k dataset, showing again that the model is unable to correctly identify the Fluid class (green) and Hyperfluorescentspots (brown), while the smaller instances of Softdrusen (olive) and Harddrusen (purple) are missing from the model's predictions.
Figure 13
YOLOE training and validation curves on the OCT5k dataset, showing that the model attains stability during training with minimal overfitting, as indicated by the close alignment of the training and validation curves.
Figure 14
YOLOE F1-confidence curve on the OCT5k dataset, showing the severe drop in performance for the PRlayerdisruption class as the confidence threshold increases, while the F1 scores for Fluid and Hyperfluorescentspots remain at 0.
Figure 15
YOLOE validation comparison between ground truth and predictions on the OCT5k dataset, showing again that the model is unable to correctly identify the Fluid class (green) and Hyperfluorescentspots (brown), while the smaller instances of Softdrusen (olive) and Harddrusen (purple) are missing from the model's predictions.
