Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders

Shunsuke Koga et al. Brain Pathol. 2024 May;34(3):e13207. doi: 10.1111/bpa.13207. Epub 2023 Aug 8.

Abstract

This study explores the utility of large language models (LLMs), specifically ChatGPT and Google Bard, in predicting neuropathologic diagnoses from clinical summaries. A total of 25 cases of neurodegenerative disorders presented at Mayo Clinic brain bank Clinico-Pathological Conferences were analyzed. The LLMs provided multiple pathologic diagnoses and their rationales, which were compared with the final clinical diagnoses made by physicians. ChatGPT-3.5, ChatGPT-4, and Google Bard correctly made the primary diagnosis in 32%, 52%, and 40% of cases, respectively, while the correct diagnosis was included among their differentials in 76%, 84%, and 76% of cases, respectively. These findings highlight the potential of artificial intelligence tools like ChatGPT in neuropathology, suggesting they may facilitate more comprehensive discussions in clinicopathological conferences.

Keywords: CPC; ChatGPT; Google Bard; artificial intelligence; clinicopathological conference; large language model; neuropathology; pathology.


Conflict of interest statement

The authors declare no conflicts of interest.
