Supervised machine learning compared to large language models for identifying functional seizures from medical records
- PMID: 39960122
- PMCID: PMC11997926
- DOI: 10.1111/epi.18272
Abstract
Objective: The Functional Seizures Likelihood Score (FSLS) is a supervised machine learning-based diagnostic score that was developed to differentiate functional seizures (FS) from epileptic seizures (ES). In contrast to this targeted approach, large language models (LLMs) can identify patterns in data for which they were not specifically trained. To evaluate the relative benefits of each approach, we compared the diagnostic performance of the FSLS to two LLMs: ChatGPT and GPT-4.
Methods: In total, 114 anonymized cases were constructed based on patients with documented FS, ES, mixed ES and FS, or physiologic seizure-like events (PSLEs). Text-based data were presented in three sequential prompts to the LLMs, showing the history of present illness (HPI), electroencephalography (EEG) results, and neuroimaging results. We compared the accuracy (number of correct predictions/number of cases) and area under the receiver-operating characteristic (ROC) curves (AUCs) of the LLMs to the FSLS using mixed-effects logistic regression.
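The two performance measures named in the Methods can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' analysis code: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example, and the mixed-effects logistic regression used for the comparison is not shown.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Accuracy = number of correct predictions / number of cases."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = FS, 0 = not FS.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
# Assumed threshold of 0.5 to turn scores into predictions.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, preds))
print(roc_auc(labels, scores))
```

The pairwise form of the AUC is used here because it needs no external library and makes the interpretation explicit; for larger datasets a rank-based or library implementation would be the usual choice.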
Results: The accuracy of the FSLS was 74% (95% confidence interval [CI] 65%-82%) and the AUC was 85% (95% CI 77%-92%). GPT-4 was superior to both the FSLS and ChatGPT (p < .001), with an accuracy of 85% (95% CI 77%-91%) and AUC of 87% (95% CI 79%-95%). Cohen's kappa between the FSLS and GPT-4 was 40% (fair). The LLMs provided different predictions on different days when the same note was provided for 33% of patients, and the LLMs' self-rated certainty was moderately correlated with this observed variability (Spearman's rho²: 30% [fair, ChatGPT] and 63% [substantial, GPT-4]).
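For readers unfamiliar with the agreement statistics reported above, the following sketch shows how Cohen's kappa between two sets of predictions, and the fraction of cases with inconsistent predictions across repeated runs, could be computed. The function names `cohens_kappa` and `inconsistent_fraction` and all example labels are hypothetical and chosen for illustration; they are not taken from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def inconsistent_fraction(runs_per_case: Sequence[Sequence[Hashable]]) -> float:
    """Fraction of cases whose repeated predictions are not all identical."""
    if len(runs_per_case) == 0:
        raise ValueError("need at least one case")
    changed = sum(len(set(runs)) > 1 for runs in runs_per_case)
    return changed / len(runs_per_case)


# Hypothetical predictions from a structured score and an LLM on four cases.
score_preds = ["FS", "FS", "ES", "ES"]
llm_preds = ["FS", "ES", "ES", "ES"]
print(cohens_kappa(score_preds, llm_preds))

# Hypothetical repeated LLM predictions for three cases on two days.
repeated = [["FS", "FS"], ["FS", "ES"], ["ES", "ES"]]
print(inconsistent_fraction(repeated))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two classifiers with similar accuracy can still show only fair agreement with each other.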
Significance: Both GPT-4 and the FSLS identified a substantial subset of patients with FS based on clinical history. The fair agreement in predictions highlights that the LLMs identified patients differently from the structured score. The inconsistency of the LLMs' predictions across days and their incomplete insight into their own consistency were concerning. This comparison highlights both benefits and cautions about how machine learning and artificial intelligence could identify patients with FS in clinical practice.
Keywords: electronic health record; informatics; physiologic seizure‐like events; psychogenic nonepileptic seizures (PNES); sensitivity.
© 2025 The Author(s). Epilepsia published by Wiley Periodicals LLC on behalf of International League Against Epilepsy.
Grants and funding
- U24 NS107158/NS/NINDS NIH HHS/United States
- Susan S. Spencer/American Academy of Neurology
- R25 NS089450/NS/NINDS NIH HHS/United States
- R25NS089450/NS/NINDS NIH HHS/United States
- K23NS135134/NS/NINDS NIH HHS/United States
- R01NS033310/NS/NINDS NIH HHS/United States
- K23 NS135134/NS/NINDS NIH HHS/United States
- T90 DA022768/DA/NIDA NIH HHS/United States
- R01 NS033310/NS/NINDS NIH HHS/United States
- T90DA022768/NS/NINDS NIH HHS/United States
- R25 NS065723/NS/NINDS NIH HHS/United States
- U24NS107158/NS/NINDS NIH HHS/United States
- R90DA022768/NS/NINDS NIH HHS/United States
- William M. Keck Foundation
- Christina Louise George Trust
- R90DA023422/NS/NINDS NIH HHS/United States
- UE5 NS065723/NS/NINDS NIH HHS/United States
- R90 DA023422/DA/NIDA NIH HHS/United States
- UPMC Competitive Medical Research Fund
- R25NS065723/NS/NINDS NIH HHS/United States
