Expert-Level Detection of Epilepsy Markers in EEG on Short and Long Timescales

J Li^{1,2,3}, D M Goldenholz^{2,3}, M Alkofer^{2,3,4}, C Sun^{2,3}, F A Nascimento⁵, J J Halford^{6,7}, B C Dean⁸, M Galanti^{8,9}, A F Struck^{5,10}, A S Greenblatt⁵, A D Lam^{2,11}, A Herlopian¹², C Nwankwo¹³, D Weber¹⁴, D Maus^{2,11}, H A Haider^{15,16}, I Karakis^{17,18}, J Y Yoo¹⁹, M C Ng²⁰, O Selioutski^{21,22}, O Taraschenko²³, G Osman²⁴, R Katyal²⁵, S E Schmitt^{6,26}, S Benbadis^{27,28}, S S Cash^{2,11}, W O Tatum²⁴, Z Sheikh²⁹, W Y Kong^{2,3}, G Bayas^{2,3}, N Turley^{2,3}, S Hong¹, M B Westover^{2,3}, J Jing^{2,3}

Affiliations

¹ National Institute of Health Data Science, Peking University, Beijing.
² Harvard Medical School, Boston.
³ Neurology Department, Beth Israel Deaconess Medical Center, Boston.
⁴ Institute for Theoretical Physics, Technical University Berlin, Berlin.
⁵ Neurology Department, Washington University in St. Louis, St Louis, MO.
⁶ Ralph H. Johnson VA Medical Center, Charleston, SC.
⁷ Electrical and Computer Engineering Department, Clemson University, Clemson, SC.
⁸ Clemson University School of Computing, Clemson, SC.
⁹ Public Health Sciences Department, Medical University of South Carolina, Charleston.
¹⁰ University of Wisconsin-Madison, Madison.
¹¹ Neurology Department, Massachusetts General Hospital, Boston.
¹² Yale University School of Medicine, New Haven, CT.
¹³ Akron Children's Hospital, Akron, OH.
¹⁴ St. Louis University School of Medicine, St Louis, MO.
¹⁵ Neurology Department, University of Chicago, Chicago.
¹⁶ University of Chicago Medical Center, Chicago.
¹⁷ Emory University School of Medicine, Atlanta.
¹⁸ University of Crete School of Medicine, Heraklion, Greece.
¹⁹ Icahn School of Medicine at Mount Sinai, New York, NY.
²⁰ University of Manitoba, Winnipeg, MB, Canada.
²¹ Stony Brook University, Stony Brook, NY.
²² University of Rochester, Rochester, NY.
²³ University of Nebraska Medical Center, Omaha.
²⁴ Mayo Clinic, Jacksonville, FL.
²⁵ Louisiana State University Health Shreveport, Shreveport.
²⁶ Neurology Department, Medical University of South Carolina, Charleston.
²⁷ University of South Florida, Tampa.
²⁸ Tampa General Hospital, Tampa, FL.
²⁹ Neurology Department, Virginia Commonwealth University, Richmond.

PMID: 40689158
PMCID: PMC12276842
DOI: 10.1056/aioa2401221

J Li et al. NEJM AI. 2025 Jul;2(7):10.1056/aioa2401221. doi: 10.1056/aioa2401221. Epub 2025 Jun 26.

Abstract

Background: Epileptiform discharges, or spikes, within electroencephalogram (EEG) recordings are essential for diagnosing epilepsy and localizing seizure origins. Artificial intelligence (AI) offers a promising approach to automating detection, but current models are often hindered by artifact-related false positives and often target either event- or EEG-level classification, thus limiting clinical utility.

Methods: We developed SpikeNet2, a deep-learning model based on a residual network architecture, and enhanced it with hard-negative mining to reduce false positives. Our study analyzed 17,812 EEG recordings from 13,523 patients across multiple institutions, including Massachusetts General Brigham (MGB) hospitals. Data from the Human Epilepsy Project (HEP) and SCORE-AI (SAI) were also included. A total of 32,433 event-level samples, labeled by experts, were used for training and evaluation. Performance was assessed using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), calibration error, and a modified area under the curve (mAUC) metric. The model's generalizability was evaluated using external datasets.
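As an illustration of three of the metrics named above, the following is a minimal sketch in plain Python, not the authors' evaluation code. The function names are hypothetical, the Brier score stands in for "calibration error" (the Figure 3 legend mentions a Brier score, but the abstract does not give the exact formula), and the paper-specific mAUC metric is not reproduced because its definition is not stated here.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative (Mann-Whitney formulation); ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.
    Assumption: tied scores are ranked in input order, not grouped."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_pos, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


def brier_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and label."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


if __name__ == "__main__":
    y = [1, 0, 1, 1, 0]                 # made-up labels, for illustration only
    p = [0.9, 0.8, 0.7, 0.6, 0.2]       # made-up model probabilities
    print(auroc(y, p), average_precision(y, p), brier_score(y, p))
```

In practice an established library implementation would normally be preferred; the hand-written versions are shown only to make the definitions explicit.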

Results: SpikeNet2 demonstrated strong performance in event-level spike detection, achieving an AUROC of 0.973 and an AUPRC of 0.995, with 44% of experts surpassing the model on the MGB dataset. In external validation, the model achieved an AUROC of 0.942 and an AUPRC of 0.948 on the HEP dataset. For EEG-level classification, SpikeNet2 recorded an AUROC of 0.958 and an AUPRC of 0.959 on the MGB dataset, an AUROC of 0.888 and an AUPRC of 0.823 on the HEP dataset, and an AUROC of 0.995 and an AUPRC of 0.991 on the SAI dataset, with 32% of experts outperforming the model. The false-positive rate was reduced to an average of nine spikes per hour.
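To make the "spikes per hour" figure concrete, here is a minimal sketch of how a false-positive rate per hour could be computed on spike-free control recordings. It rests on assumptions not stated in the abstract: every detection in a control recording counts as a false positive, detections are already de-duplicated, and the average is taken per recording and not pooled. The function names and the threshold value are hypothetical.

```python
from typing import Sequence, Tuple

# One control recording: (per-event model scores, recording duration in seconds)
Recording = Tuple[Sequence[float], float]


def false_positives_per_hour(scores: Sequence[float], threshold: float,
                             duration_seconds: float) -> float:
    """Detections at or above the threshold, normalised to one hour."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    detections = sum(1 for s in scores if s >= threshold)
    return detections / (duration_seconds / 3600.0)


def mean_false_positive_rate(recordings: Sequence[Recording],
                             threshold: float) -> float:
    """Average of the per-recording false-positive rates."""
    if not recordings:
        raise ValueError("need at least one recording")
    rates = [false_positives_per_hour(scores, threshold, duration)
             for scores, duration in recordings]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    controls = [([0.9, 0.2, 0.7], 1800.0),   # 30-minute control EEG (made-up)
                ([0.1, 0.6], 3600.0)]        # 60-minute control EEG (made-up)
    print(mean_false_positive_rate(controls, threshold=0.5))
```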

Conclusions: SpikeNet2 offers expert-level accuracy in both event-level spike detection and EEG-level classification, while significantly reducing false positives. Its dual functionality and robust performance across diverse datasets make it a promising tool for clinical and telemedicine applications, particularly in resource-limited settings. (Funded by the National Institutes of Health and others.)

Figures

**Figure 1. Data Used in Model Development and Validation.**
Note that patients and electroencephalograms may overlap in different training phases during model development, but there was strictly no intersection between training and test sets. EEG denotes electroencephalogram; HEP, Human Epilepsy Project; MGB, Massachusetts General Brigham; and SAI, SCORE–Artificial Intelligence.

**Figure 2. The Pipeline of Hard-Negative Mining.**
EEG denotes electroencephalogram; and IED, interictal epileptiform discharges.
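The figure itself is not reproduced here, so the following is only a generic sketch of hard-negative mining as the technique is commonly described, not the SpikeNet2 pipeline. A model is trained, run over spike-free control data, and any high-scoring (false-positive) segments are added back as negatives before retraining. The one-dimensional "feature", the toy midpoint classifier, and all names are hypothetical stand-ins for a real EEG model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]           # (toy 1-D feature, label: 1 = spike, 0 = non-spike)
ScoreFn = Callable[[float], float]   # maps a feature to a spike probability


def train_toy_model(data: Sequence[Sample]) -> ScoreFn:
    """Toy stand-in for model training: a logistic curve centred midway
    between the mean positive and mean negative feature values."""
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1.0 / (1.0 + math.exp(-(x - midpoint)))


def hard_negative_mining(train_fn: Callable[[Sequence[Sample]], ScoreFn],
                         labeled: Sequence[Sample],
                         controls: Sequence[float],
                         threshold: float = 0.5,
                         max_rounds: int = 3) -> Tuple[ScoreFn, List[Sample]]:
    """Repeatedly add control segments the model wrongly scores as spikes
    to the training set as negatives, then retrain."""
    data = list(labeled)
    mined = set()                     # indices of controls already added
    model = train_fn(data)
    for _ in range(max_rounds):
        hard = [i for i, x in enumerate(controls)
                if i not in mined and model(x) >= threshold]
        if not hard:                  # no new false positives: stop early
            break
        mined.update(hard)
        data.extend((controls[i], 0) for i in hard)
        model = train_fn(data)
    return model, data


if __name__ == "__main__":
    labeled = [(5.0, 1), (6.0, 1), (0.0, 0), (1.0, 0)]   # made-up samples
    controls = [3.2, 3.4, 0.2]                           # spike-free, e.g. artifacts
    model, data = hard_negative_mining(train_toy_model, labeled, controls)
    print(len(data), model(3.4))
```

In this toy setting the two artifact-like control segments initially score above 0.5 and fall below it after one mining round, which is the qualitative effect the technique aims for.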

**Figure 3. Event-Level Spike-Classification Performance of SpikeNet2 Compared with Benchmark Models.**
Panel A shows the receiver operating characteristic (ROC) curve, Panel B the precision–recall (PR) curve, and Panel C the calibration curve for the Massachusetts General Brigham (MGB) test dataset, with 16 human raters’ operating points shown for comparison. SpikeNet2 (SN2b) performance is color-coded in green, SpikeNet1 (SN1) in blue, and SpikeNet2 before hard-negative mining (SN2a) in red for comparison. Panel D shows the ROC curve, Panel E the PR curve, and Panel F the calibration curve for SpikeNet2 and comparators on the Human Epilepsy Project external validation dataset. Panel G shows a modified ROC curve and Panel H a zoomed-in modified ROC curve on the MGB control test dataset. Figures in parentheses denote 95% confidence intervals. AUC denotes area under the curve; BS, Brier (calibration) score; EBSN2b, the percentage of experts who outperform SN2b; FP, false positive; FPR, false-positive rate; HEP, Human Epilepsy Project; mAUC, normalized area under the modified receiver operating characteristic curve; MGB, Massachusetts General Brigham; PPV, positive predictive value; SN1, SpikeNet1; SN2a, SpikeNet2 without hard-negative mining; SN2b, SpikeNet2 with hard-negative mining; and TPR, true-positive rate.
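The legend reports the percentage of experts who outperform the model but does not define the comparison. One plausible reading, offered here purely as an assumption, is that an expert outperforms the model when the expert's operating point lies strictly above the model's ROC curve at the same false-positive rate. The sketch below implements that reading; the function names and the example numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false-positive rate, true-positive rate)


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """ROC points obtained by sweeping the threshold from high to low;
    tied scores are processed together as one step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def model_tpr_at(points: Sequence[Point], fpr: float) -> float:
    """Model TPR at a given FPR, linearly interpolated between ROC points."""
    lo = max((p for p in points if p[0] <= fpr), key=lambda p: (p[0], p[1]))
    above = [p for p in points if p[0] > fpr]
    if not above:
        return lo[1]
    hi = min(above, key=lambda p: (p[0], p[1]))
    return lo[1] + (hi[1] - lo[1]) * (fpr - lo[0]) / (hi[0] - lo[0])


def percent_experts_above(points: Sequence[Point],
                          experts: Sequence[Point]) -> float:
    """Share of expert operating points lying strictly above the ROC curve."""
    above = sum(1 for fpr, tpr in experts if tpr > model_tpr_at(points, fpr))
    return 100.0 * above / len(experts)


if __name__ == "__main__":
    curve = roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.3, 0.1])
    raters = [(0.0, 0.5), (0.0, 1.0), (1 / 6, 0.9), (0.5, 0.9)]  # made-up
    print(percent_experts_above(curve, raters))
```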

**Figure 4. EEG-Level Spike-Classification Performance of SpikeNet2 Compared with Benchmark Models.**
Panel A shows the receiver operating characteristic (ROC) curve and Panel B the precision–recall (PR) curve of SpikeNet2 on the Massachusetts General Brigham test set. Panel C shows the ROC curve, and Panel D shows the PR curve of SpikeNet2 on the Human Epilepsy Project external validation dataset. Panel E shows the ROC curve, and Panel F shows the PR curve of SpikeNet2 and the comparator model (SCORE-AI) on the SCORE-AI external validation dataset, with operating points of 14 human raters shown for comparison. Figures in parentheses denote 95% confidence intervals. AUC denotes area under the curve; EBSN2, the percentage of experts who outperform SN2-EEG; FPR, false-positive rate; HEP, Human Epilepsy Project; MGB, Massachusetts General Brigham; PPV, positive predictive value; SAI, SCORE–Artificial Intelligence; SN2-EEG, SpikeNet2 for electroencephalography-level task; and TPR, true-positive rate.
