Cureus. 2024 Apr 27;16(4):e59143.
doi: 10.7759/cureus.59143. eCollection 2024 Apr.

Accuracy of ChatGPT in Neurolocalization

Waleed F Dabbas et al. Cureus. 2024.

Abstract

Introduction: ChatGPT (OpenAI Incorporated, Mission District, San Francisco, United States) is an artificial intelligence (AI) chatbot with advanced communication skills and a massive knowledge database. However, its application in medicine, specifically in neurolocalization, requires clinical reasoning in addition to deep neuroanatomical knowledge. This article examines ChatGPT's capabilities in neurolocalization.

Methods: Forty-six text-based neurolocalization case scenarios were presented to ChatGPT-3.5 from November 6th, 2023, to November 16th, 2023. Seven neurosurgeons scored the accuracy of ChatGPT's responses to these cases using a 5-point scoring system recommended by ChatGPT.

Results: ChatGPT-3.5 achieved an accuracy score of 84.8%, counting "completely correct" and "mostly correct" responses. Analysis of variance (ANOVA) suggested a consistent scoring approach between different evaluators. The mean length of the case text was 69.8 tokens (SD 20.8).

Conclusion: While this accuracy score is promising, ChatGPT is not yet reliable enough for routine patient care. We recommend keeping interactions with ChatGPT concise, precise, and simple to improve response accuracy. As AI continues to evolve, it will bring significant and innovative breakthroughs in medicine.

Keywords: anatomical localization; anatomy; artificial intelligence; brain anatomy; chatgpt; diagnosis; generative pre-trained transformers; neuroanatomy; neurolocalization; neurosurgery.
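
The two statistics named in the Results can be illustrated with a minimal sketch. It assumes that "accuracy" means the share of ratings falling in the top two categories and that evaluator consistency was checked with a one-way ANOVA across evaluators; the abstract does not give the exact procedure, so the function names, the category set, and any data passed in are hypothetical and are not the study's own.

```python
from statistics import mean

# Assumption: only these two category names appear in the abstract;
# the remaining three levels of the 5-point scale are not named there.
ACCEPTABLE = {"completely correct", "mostly correct"}


def accuracy(labels):
    """Share of ratings that fall in the top two categories."""
    return sum(label in ACCEPTABLE for label in labels) / len(labels)


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    `groups` holds one list of numeric scores per evaluator.
    A small F suggests the evaluators' mean scores are similar.
    Raises ZeroDivisionError if there is no within-group variation.
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)

    # Variation of each evaluator's mean around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own evaluator's mean.
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

Calling `accuracy` on a list of rating labels returns a fraction between 0 and 1, and `one_way_anova_f` expects each evaluator's ratings mapped to numbers (for example 1 to 5). Converting the F statistic to a p-value would additionally need the F distribution, which this sketch leaves out.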

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Neurolocalization case study of peroneal nerve injury presented to ChatGPT-3.5, featuring the examined prompt, ChatGPT's response, and the suggested answer.
ChatGPT's response starts with a summary of the case and provides an answer that was scored as "completely correct" by the evaluator. It subsequently covers the relevant anatomy and pathophysiology, explains how other differential diagnoses were excluded, and concludes by advising further evaluation by a healthcare professional, recommending additional diagnostic tests, and discussing potential treatment options.
Figure 2. Neurolocalization case study of abducent nerve palsy presented to ChatGPT-3.5, featuring the examined prompt, ChatGPT's response, and the suggested answer.
ChatGPT identifies and prioritizes critical information from the neurolocalization case study prompt to construct its response.
