Eye (Lond). 2025 Oct 7. doi: 10.1038/s41433-025-04013-8. Online ahead of print.

Comparative analysis of generic vision-language models in detecting and diagnosing inherited retinal diseases using fundus photographs


Xiang Meng et al. Eye (Lond).

Abstract

Background: To evaluate the clinical applicability of three generic Vision-Large-Language Models (VLLMs) - OpenAI's GPT-4omni (GPT-4o) and GPT-4V(ision), and Google's Gemini - in detecting and diagnosing inherited retinal diseases (IRDs) using fundus photographs.

Methods: This head-to-head comparative study curated 60 ultra-widefield (UWF) fundus images from 30 IRD patients at the National University Hospital, Singapore. Ten open-source UWF fundus images of normal eyes were included for comparison. The 70 fundus images were analysed by the three VLLMs using standardised prompts to generate descriptions of 10 specified retinal features and to provide clinical insights. Each VLLM received 2100 scores for its descriptions across the ten features, rated by three blinded consultant-level graders on a three-point scale (0 = poor, 1 = borderline, 2 = good). Clinical insights, including disease detection, diagnosis and pathological gene inference, were evaluated against clinical ground truth.
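The figure of 2100 scores per model follows directly from the study design described above: 70 images, 10 specified retinal features per image, and 3 blinded graders. A minimal sketch of that arithmetic (all counts taken from the Methods):

```python
# Grading workload per VLLM, per the Methods section.
n_ird_images = 60      # UWF images from 30 IRD patients
n_normal_images = 10   # open-source normal UWF images
n_features = 10        # specified retinal features described per image
n_graders = 3          # blinded consultant-level graders

n_images = n_ird_images + n_normal_images          # 70 images in total
scores_per_model = n_images * n_features * n_graders

print(scores_per_model)  # 2100, matching the count reported in the Methods
```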

Results: GPT-4o achieved the highest mean quality score in feature description (1.64 [0.697], mean [SEM]), outperforming GPT-4V (1.57 [0.738]) and Gemini (1.46 [0.800]; both p < 0.001). All models demonstrated high detection accuracy (≥81.4%), but Gemini incorrectly classified all normal fundus images as IRD. GPT-4o (65.7%) outperformed GPT-4V (50%) and Gemini (60%) in diagnostic accuracy. Gene inference precision remained low (≤20.3%) across all models. High concordance was observed across all models between feature descriptions and diagnoses (≥97.1%), and between diagnoses and clinical recommendations (100%).

Conclusions: GPT-4o and GPT-4V demonstrated promising potential in detecting IRDs from fundus photographs, with good feature-extraction capabilities and high detection accuracy. Gemini, however, misclassified normal fundus images as IRD. All three VLLMs require further refinement to improve diagnostic accuracy and gene inference.


Conflict of interest statement

Competing interests: The authors declare no competing interests.
