Review
Curr Opin Ophthalmol. 2024 Nov 1;35(6):487-493.
doi: 10.1097/ICU.0000000000001089. Epub 2024 Aug 26.

Vision language models in ophthalmology

Gilbert Lim et al. Curr Opin Ophthalmol.

Abstract

Purpose of review: Vision Language Models are an emerging paradigm in artificial intelligence that offers the potential to natively analyze both image and textual data simultaneously, within a single model. The fusion of these two modalities is of particular relevance to ophthalmology, which has historically involved specialized imaging techniques such as angiography, optical coherence tomography, and fundus photography, while also interfacing with electronic health records that include free-text descriptions. This review surveys the fast-evolving field of Vision Language Models as they apply to current ophthalmologic research and practice.

Recent findings: Although models incorporating both image and text data have a long provenance in ophthalmology, effective multimodal Vision Language Models are a recent development exploiting advances in technologies such as transformer and autoencoder models.

Summary: Vision Language Models offer the potential to assist and streamline the existing clinical workflow in ophthalmology, whether before, during, or after the visit. There are, however, important challenges still to be overcome, particularly regarding patient privacy and the explainability of model recommendations.
