A descriptive study based on the comparison of ChatGPT and evidence-based neurosurgeons
- PMID: 37705958
- PMCID: PMC10495632
- DOI: 10.1016/j.isci.2023.107590
Abstract
ChatGPT is an artificial intelligence product developed by OpenAI. This study aims to investigate whether ChatGPT can respond in accordance with evidence-based medicine in neurosurgery. We generated 50 neurosurgical questions covering neurosurgical diseases. Each question was posed three times to GPT-3.5 and GPT-4.0. We also recruited three neurosurgeons with high, middle, and low seniority to respond to questions. The results were analyzed regarding ChatGPT's overall performance score, mean scores by the items' specialty classification, and question type. In conclusion, GPT-3.5's ability to respond in accordance with evidence-based medicine was comparable to that of neurosurgeons with low seniority, and GPT-4.0's ability was comparable to that of neurosurgeons with high seniority. Although ChatGPT is yet to be comparable to a neurosurgeon with high seniority, future upgrades could enhance its performance and abilities.
Keywords: Artificial intelligence applications; Health informatics; Neurology; Neurosurgery.
© 2023 The Authors.
Conflict of interest statement
The authors declare no competing interests.