Ophthalmol Sci. 2025 Jun 9;5(6):100847.
doi: 10.1016/j.xops.2025.100847. eCollection 2025 Nov-Dec.

A Practical Guide to Evaluating Artificial Intelligence Imaging Models in Scientific Literature

Angela McCarthy et al. Ophthalmol Sci.

Abstract

Objective: Recent advances in artificial intelligence (AI) are revolutionizing ophthalmology by enhancing diagnostic accuracy, treatment planning, and patient management. However, a significant gap remains in practical guidance for ophthalmologists who lack AI expertise to effectively analyze these technologies and assess their readiness for integration into clinical practice. This paper aims to bridge this gap by demystifying AI model design and providing practical recommendations for evaluating AI imaging models in research publications.

Design: Educational review synthesizing key considerations for evaluating AI papers in ophthalmology.

Participants: This paper draws on insights from an interdisciplinary team of ophthalmologists and AI experts with experience in developing and evaluating AI models for clinical applications.

Methods: A structured framework was developed based on expert discussions and a review of key methodological considerations in AI research.

Main outcome measures: A stepwise approach to evaluating AI models in ophthalmology, providing clinicians with practical strategies for assessing AI research.

Results: This guide offers broad recommendations applicable across ophthalmology and medicine.

Conclusions: As the landscape of health care continues to evolve, proactive engagement with AI will empower clinicians to lead the way in innovation while concurrently prioritizing patient safety and quality of care.

Financial disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

Keywords: Artificial intelligence; Glaucoma detection; Machine learning; Ophthalmology.

Figures

Figure 1
Three variations of the same OCT image using data augmentation techniques. The original image (left) is modified to create new computer-generated versions by adding Gaussian noise (center) and applying a horizontal flip (right). These transformations simulate variability in the dataset, enhancing the AI model's ability to generalize. AI = artificial intelligence.
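The two augmentations named in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the tiny `scan` grid stands in for an OCT image, and the function names, the noise level `sigma`, and the 0 to 255 intensity range are all assumptions made for the example.

```python
import random


def horizontal_flip(image):
    """Mirror every row left-to-right; pixel values are unchanged."""
    return [row[::-1] for row in image]


def add_gaussian_noise(image, sigma=10.0, seed=0):
    """Add zero-mean Gaussian noise to each pixel, then clip to 0..255."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


# Hypothetical 2x3 grayscale "image" standing in for an OCT scan.
scan = [[0, 64, 128], [32, 96, 255]]
flipped = horizontal_flip(scan)
noisy = add_gaussian_noise(scan)
```

Both outputs keep the original image dimensions and, by construction, the original label, which is why such copies can be added to a training set as extra examples.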
Figure 2
Grad-CAM highlights the regions most important to the AI model's decision-making process, enhancing interpretability. In this case, the Grad-CAM (right) illustrates that the model relies on the arcuate region of the RNFL probability map to assess glaucoma, as indicated by the yellow highlights. This provides a visual explanation of the AI's focus during analysis. The color scale (yellow to blue) indicates the relevance of different regions, with yellow showing the highest relevance. AI = artificial intelligence; Grad-CAM = Gradient-Weighted Class Activation Maps; GCL = Ganglion Cell Layer; RNFL = retinal nerve fiber layer; VF = visual fields.
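For readers curious how such a heatmap is formed, the sketch below shows only the final combination step of Grad-CAM: each feature map is weighted by the average gradient of the class score with respect to that map, the weighted maps are summed, and negative values are set to zero. The activations and gradients here are invented toy numbers; in practice they would come from a trained network, and the function name is an assumption for illustration.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into one relevance heatmap.

    activations, gradients: lists of 2D grids (one grid per channel),
    all with the same height and width.
    """
    n_rows, n_cols = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * n_cols for _ in range(n_rows)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: gradient averaged over all spatial positions.
        weight = sum(sum(row) for row in grad) / (n_rows * n_cols)
        for r in range(n_rows):
            for c in range(n_cols):
                heatmap[r][c] += weight * fmap[r][c]
    # ReLU: keep only regions that push the score toward the class.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Toy two-channel example (hypothetical values, not from any model).
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

In a figure like this one, the resulting grid would be upsampled to the input size and rendered with a color scale.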
Figure 3
Precision-recall curve. AUC = area under the curve; PR = precision-recall.
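As a rough illustration of how a precision-recall curve is traced, the sketch below lowers a decision threshold through the model scores and records precision and recall at each step. The labels and scores are made up for the example, and the step-wise sum used to summarize the area (often called average precision) is one common convention, not necessarily the one used in the paper.

```python
def precision_recall_points(labels, scores):
    """Return (recall, precision) pairs, one per distinct score threshold."""
    total_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, tp, fp = [], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only once all cases tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(points):
    """Step-wise area under the PR curve."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Hypothetical data: 1 = disease present, 0 = absent.
pr_labels = [1, 1, 0, 1, 0, 0]
pr_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
pr_points = precision_recall_points(pr_labels, pr_scores)
pr_area = average_precision(pr_points)
```

Because precision depends on how common the disease is in the test set, the same model can produce a different curve on a population with a different prevalence.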
Figure 4
Receiver operating characteristic curve. ROC = receiver operating characteristic.
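A receiver operating characteristic curve can be built the same way, plotting the true-positive rate (sensitivity) against the false-positive rate (1 minus specificity) as the threshold is lowered. The sketch below uses invented labels and scores and the trapezoidal rule for the area; the function names are assumptions for illustration only.

```python
def roc_points(labels, scores):
    """Return (false-positive rate, true-positive rate) pairs from (0, 0) to (1, 1)."""
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only once all cases tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / total_neg, tp / total_pos))
    return points


def auc_trapezoid(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = disease present, 0 = absent.
roc_labels = [1, 1, 0, 1, 0, 0]
roc_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
curve = roc_points(roc_labels, roc_scores)
roc_auc = auc_trapezoid(curve)
```

The area equals the fraction of diseased/healthy pairs in which the diseased case receives the higher score: here 8 of 9 pairs, or about 0.89.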
