Artificial Versus Human Intelligence in the Diagnostic Approach of Ophthalmic Case Scenarios: A Qualitative Evaluation of Performance and Consistency

Achilleas Mandalos¹, Dimitrios Tsouris²

Affiliations

¹ Ophthalmology, General Hospital of Karditsa, Karditsa, GRC.
² Ophthalmology, General University Hospital of Larissa, Larissa, GRC.

PMID: 39015855
PMCID: PMC11251728
DOI: 10.7759/cureus.62471

Cureus. 2024 Jun 16;16(6):e62471. doi: 10.7759/cureus.62471. eCollection 2024 Jun.

Abstract

Purpose: To evaluate the efficiency of three artificial intelligence (AI) chatbots, ChatGPT-3.5 (OpenAI, San Francisco, California, United States), Bing Copilot (Microsoft Corporation, Redmond, Washington, United States), and Google Gemini (Google LLC, Mountain View, California, United States), in assisting the ophthalmologist in the diagnostic approach and management of challenging ophthalmic cases, and to compare their performance with that of a practicing human ophthalmic specialist. The secondary aim was to assess the short- and medium-term consistency of ChatGPT's responses.

Methods: Eleven ophthalmic case scenarios of variable complexity were presented to the AI chatbots and to an ophthalmic specialist in a stepwise fashion. Each was asked for advice regarding the initial differential diagnosis, the final diagnosis, further investigation, and management. One month later, the same process was repeated twice on the same day for ChatGPT only.

Results: The individual diagnostic performance of all three AI chatbots was inferior to that of the ophthalmic specialist; however, they provided useful complementary input in the diagnostic algorithm. This was especially true for ChatGPT and Bing Copilot. ChatGPT exhibited reasonable short- and medium-term consistency, with the mean Jaccard similarity coefficient of responses varying between 0.58 and 0.76.
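
For readers unfamiliar with the consistency metric named in the Results, the Jaccard similarity coefficient of two sets is the size of their intersection divided by the size of their union, giving 1 for identical sets and 0 for disjoint ones. The sketch below is illustrative only: the abstract does not state how responses were converted into sets, so treating each response as a set of listed diagnoses is an assumption, and the function names and example diagnoses are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of labels."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention assumed here: two empty responses count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


def mean_jaccard(pairs):
    """Average the coefficient over (first response, repeated response) pairs."""
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical differential-diagnosis lists from two runs of the same case.
first_run = ["uveitis", "scleritis", "episcleritis", "keratitis"]
repeat_run = ["uveitis", "scleritis", "conjunctivitis"]

# 2 shared diagnoses out of 5 distinct ones overall.
score = jaccard(first_run, repeat_run)
```

Under this reading, a mean coefficient between 0.58 and 0.76 would indicate that repeated responses shared somewhat more than half to about three quarters of their combined items on average.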

Conclusion: AI chatbots may act as useful assisting tools in the diagnosis and management of challenging ophthalmic cases; however, their responses should be scrutinized for potential inaccuracies, and by no means can they replace consultation with an ophthalmic specialist.

Keywords: artificial intelligence; bing copilot; challenging ophthalmic cases; chatgpt; diagnostic approach; google gemini.

Conflict of interest statement

Human subjects: Consent was obtained or waived by all participants in this study. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue. Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.
