Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders
- PMID: 37553205
- PMCID: PMC11006994
- DOI: 10.1111/bpa.13207
Abstract
This study explores the utility of large language models (LLMs), specifically ChatGPT and Google Bard, in predicting neuropathologic diagnoses from clinical summaries. A total of 25 cases of neurodegenerative disorders presented at Mayo Clinic brain bank Clinico-Pathological Conferences were analyzed. The LLMs provided multiple pathologic diagnoses and their rationales, which were compared with the final clinical diagnoses made by physicians. ChatGPT-3.5, ChatGPT-4, and Google Bard made the correct primary diagnosis in 32%, 52%, and 40% of cases, respectively, while the correct diagnosis was included among the diagnoses each model listed in 76%, 84%, and 76% of cases, respectively. These findings highlight the potential of artificial intelligence tools like ChatGPT in neuropathology, suggesting they may facilitate more comprehensive discussions in clinicopathological conferences.
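As a rough illustration of what the reported percentages mean in terms of cases, the sketch below converts them to counts out of the 25 cases analyzed. The case counts are derived by arithmetic and are not stated in the abstract; the names `reported`, `pct_to_count`, and `counts` are hypothetical and chosen only for this example.

```python
# Minimal sketch: convert the percentages reported in the abstract into
# case counts. The counts are derived here, not quoted from the source.
TOTAL_CASES = 25  # cases analyzed, per the abstract

# Percentages as reported: correct primary diagnosis, and correct diagnosis
# included anywhere among the diagnoses the model listed.
reported = {
    "ChatGPT-3.5": {"primary": 32, "included": 76},
    "ChatGPT-4": {"primary": 52, "included": 84},
    "Google Bard": {"primary": 40, "included": 76},
}


def pct_to_count(pct: int, total: int = TOTAL_CASES) -> int:
    """Return the whole number of cases that `pct` percent of `total` represents."""
    count, remainder = divmod(pct * total, 100)
    if remainder != 0:
        # Guard against percentages that do not map to a whole number of cases.
        raise ValueError(f"{pct}% of {total} is not a whole number of cases")
    return count


counts = {
    model: {metric: pct_to_count(pct) for metric, pct in metrics.items()}
    for model, metrics in reported.items()
}

for model, c in counts.items():
    print(f"{model}: primary correct in {c['primary']}/{TOTAL_CASES}, "
          f"correct diagnosis listed in {c['included']}/{TOTAL_CASES}")
```

With only 25 cases, each case accounts for 4 percentage points, so differences between models of a few points correspond to one or two cases.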
Keywords: CPC; ChatGPT; Google Bard; artificial intelligence; clinicopathological conference; large language model; neuropathology; pathology.
© 2023 The Authors. Brain Pathology published by John Wiley & Sons Ltd on behalf of International Society of Neuropathology.
Conflict of interest statement
The authors declare no conflicts of interest.