Expert-Level Detection of Epilepsy Markers in EEG on Short and Long Timescales
- PMID: 40689158
- PMCID: PMC12276842
- DOI: 10.1056/aioa2401221
Abstract
Background: Epileptiform discharges, or spikes, within electroencephalogram (EEG) recordings are essential for diagnosing epilepsy and localizing seizure origins. Artificial intelligence (AI) offers a promising approach to automating detection, but current models are often hindered by artifact-related false positives, and most target either event-level or EEG-level classification rather than both, which limits their clinical utility.
Methods: We developed SpikeNet2, a deep-learning model based on a residual network architecture, and enhanced it with hard-negative mining to reduce false positives. Our study analyzed 17,812 EEG recordings from 13,523 patients across multiple institutions, including Massachusetts General Brigham (MGB) hospitals. Data from the Human Epilepsy Project (HEP) and SCORE-AI (SAI) were also included. A total of 32,433 event-level samples, labeled by experts, were used for training and evaluation. Performance was assessed using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), calibration error, and a modified area under the curve (mAUC) metric. The model's generalizability was evaluated using external datasets.
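The evaluation metrics named in the Methods can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the use of a 10-bin expected calibration error are assumptions made for illustration, and the abstract does not specify which calibration-error variant or which mAUC definition was used (mAUC is therefore omitted).

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of AUPRC: mean precision at the rank of each positive.

    Assumption: tied scores are ordered arbitrarily (no special tie handling).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return total / hits


def expected_calibration_error(
    labels: Sequence[int], scores: Sequence[float], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted probability and observed rate per bin."""
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((y, s))
    n = len(labels)
    ece = 0.0
    for b in bins:
        if b:
            observed = sum(y for y, _ in b) / len(b)
            predicted = sum(s for _, s in b) / len(b)
            ece += (len(b) / n) * abs(observed - predicted)
    return ece


# Toy data (hypothetical): 1 = expert-labeled spike, 0 = non-spike.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.95, 0.85, 0.75, 0.65, 0.35, 0.15]

toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
toy_ece = expected_calibration_error(toy_labels, toy_scores)
print(f"AUROC={toy_auroc:.3f}  AUPRC={toy_ap:.3f}  ECE={toy_ece:.3f}")
```

The pairwise AUROC loop is quadratic in the number of samples, which is fine for a toy example; at the scale of tens of thousands of labeled events, a rank-based implementation from an established library would be the more practical choice.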
Results: SpikeNet2 demonstrated strong performance in event-level spike detection, achieving an AUROC of 0.973 and an AUPRC of 0.995, with 44% of experts surpassing the model on the MGB dataset. In external validation, the model achieved an AUROC of 0.942 and an AUPRC of 0.948 on the HEP dataset. For EEG-level classification, SpikeNet2 recorded an AUROC of 0.958 and an AUPRC of 0.959 on the MGB dataset, an AUROC of 0.888 and an AUPRC of 0.823 on the HEP dataset, and an AUROC of 0.995 and an AUPRC of 0.991 on the SAI dataset, with 32% of experts outperforming the model. The false-positive rate was reduced to an average of nine spikes per hour.
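To make the false-positive rate concrete, the sketch below shows one way a per-hour figure could be computed for a single recording. The abstract does not describe how detections were matched to expert annotations, so the matching rule (a detection counts as a true positive if it falls within a tolerance window of any expert-marked spike), the 0.5-second tolerance, the function name, and the example timestamps are all hypothetical.

```python
from typing import Sequence


def false_positives_per_hour(
    detections_s: Sequence[float],
    expert_spikes_s: Sequence[float],
    duration_s: float,
    tolerance_s: float = 0.5,  # assumed matching window, not taken from the paper
) -> float:
    """Count detections with no expert spike within tolerance, per hour of EEG."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    false_pos = sum(
        1
        for d in detections_s
        if not any(abs(d - e) <= tolerance_s for e in expert_spikes_s)
    )
    return false_pos / (duration_s / 3600.0)


# Hypothetical 30-minute recording: two detections match expert marks, two do not.
example_rate = false_positives_per_hour(
    detections_s=[10.0, 125.0, 600.0, 1500.0],
    expert_spikes_s=[10.2, 1499.8],
    duration_s=1800.0,
)
print(f"{example_rate:.1f} false positives per hour")
```

Averaging this quantity across recordings would give a summary figure comparable in form to the one reported, although the study's exact procedure may differ.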
Conclusions: SpikeNet2 offers expert-level accuracy in both event-level spike detection and EEG-level classification, while significantly reducing false positives. Its dual functionality and robust performance across diverse datasets make it a promising tool for clinical and telemedicine applications, particularly in resource-limited settings. (Funded by the National Institutes of Health and others.)
Grants and funding
- I01 HX003107/HX/HSRD VA/United States
- R01 NS102190/NS/NINDS NIH HHS/United States
- R01 AG073410/AG/NIA NIH HHS/United States
- RF1 NS120947/NS/NINDS NIH HHS/United States
- RF1 AG064312/AG/NIA NIH HHS/United States
- P20 GM130447/GM/NIGMS NIH HHS/United States
- R01 NS102574/NS/NINDS NIH HHS/United States
- R21 NS137117/NS/NINDS NIH HHS/United States
- R01 HL161253/HL/NHLBI NIH HHS/United States
- K23 NS124656/NS/NINDS NIH HHS/United States
- UG3 TR004501/TR/NCATS NIH HHS/United States
- R01 AG073598/AG/NIA NIH HHS/United States
- R01 NS126282/NS/NINDS NIH HHS/United States
- R01 NS107291/NS/NINDS NIH HHS/United States