A public benchmark for human performance in the detection of focal cortical dysplasia

Lennart Walger¹,², Matthias H Schmitz¹,², Tobias Bauer¹,²,³, David Kügler³, Fabiane Schuch², Christophe Arendt⁴, Tobias Baumgartner², Johannes Birkenheier⁵, Valeri Borger⁶, Christoph Endler⁷, Franziska Grau¹, Christian Immanuel⁷, Markus Kölle⁸, Patrick Kupczyk⁷, Asadeh Lakghomi¹,⁷, Sarah Mackert⁸, Elisabeth Neuhaus⁴, Julia Nordsiek⁵, Anna-Maria Odenthal⁷, Karmele Olaciregui Dague², Laura Ostermann², Jan Pukropski², Attila Racz², Klaus von der Ropp², Frederic Carsten Schmeel¹, Felix Schrader⁸, Aileen Sitter⁸, Alexander Unruh-Pinheiro², Marilia Voigt⁷, Martin Vychopen⁶, Philip von Wedel²,⁹, Randi von Wrede², Ulrike Attenberger⁷, Hartmut Vatter⁶, Alexandra Philipsen⁸, Albert Becker¹⁰, Martin Reuter³,¹¹,¹², Elke Hattingen⁴, Alexander Radbruch¹,³,¹³, Rainer Surges², Theodor Rüber¹,²,³,¹³

Affiliations

¹ Department of Neuroradiology, University Hospital Bonn, Bonn, Germany.
² Department of Epileptology, University Hospital Bonn, Bonn, Germany.
³ German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany.
⁴ Department of Neuroradiology, Goethe University Frankfurt, Frankfurt, Germany.
⁵ Department of Neurology, University Hospital Bonn, Bonn, Germany.
⁶ Department of Neurosurgery, University Hospital Bonn, Bonn, Germany.
⁷ Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Bonn, Germany.
⁸ Department of Psychiatry and Psychotherapy, University Hospital Bonn, Bonn, Germany.
⁹ Chair of Economic & Social Policy, WHU - Otto Beisheim School of Management, Vallendar, Germany.
¹⁰ Department of Neuropathology, University Hospital Bonn, Bonn, Germany.
¹¹ A. A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, Massachusetts, USA.
¹² Department of Radiology, Harvard Medical School, Boston, Massachusetts, USA.
¹³ Center for Medical Data Usability and Translation, University of Bonn, Bonn, Germany.

PMID: 40167314
PMCID: PMC12163524
DOI: 10.1002/epi4.70028

Lennart Walger et al. Epilepsia Open. 2025 Jun;10(3):778-786. doi: 10.1002/epi4.70028. Epub 2025 Apr 1.

Abstract

Objective: This study aims to report human performance in the detection of Focal Cortical Dysplasias (FCDs) using an openly available dataset. Additionally, it defines a subset of this data as a "difficult" test set to establish a public baseline benchmark against which new methods for automated FCD detection can be evaluated.

Methods: The performance of 28 human readers with varying levels of expertise in detecting FCDs was originally analyzed in 146 subjects, not all of whom are openly available; here, we analyzed the openly available subset of 85 cases. Performance was measured based on the overlap between predicted regions of interest (ROIs) and ground-truth lesion masks, using the Dice-Soerensen coefficient (DSC). The benchmark test set comprises the 15 subjects most predictive of human performance and 13 subjects identified by at most 3 of the 28 readers.
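The overlap measure named in the Methods can be illustrated with a minimal sketch. It assumes the ROI and the lesion mask are both represented as sets of integer voxel coordinates; the helper names (`box_voxels`, `dice`, `hard_cases`) and the toy data are hypothetical and are not taken from the study's software. How the DSC is thresholded into a "detected" decision is not stated in the abstract, so the sketch only computes the coefficient and, separately, picks out subjects found by at most 3 readers.

```python
from itertools import product


def box_voxels(lo, hi):
    """Set of integer voxel coordinates inside an axis-aligned box (corners inclusive)."""
    return set(product(*(range(a, b + 1) for a, b in zip(lo, hi))))


def dice(roi, lesion):
    """Dice-Soerensen coefficient of two voxel sets: 2|A ∩ B| / (|A| + |B|)."""
    if not roi and not lesion:
        return 1.0  # convention for two empty sets (an assumption of this sketch)
    return 2 * len(roi & lesion) / (len(roi) + len(lesion))


def hard_cases(detections, max_readers=3):
    """Subjects detected by at most `max_readers` readers.

    `detections` maps a subject ID to a list of per-reader booleans.
    """
    return sorted(s for s, hits in detections.items() if sum(hits) <= max_readers)


# Toy data (hypothetical): two 4x4x4 boxes that share a 2x2x2 corner.
lesion = box_voxels((2, 2, 2), (5, 5, 5))
roi = box_voxels((4, 4, 4), (7, 7, 7))
score = dice(roi, lesion)  # 2 * 8 / (64 + 64)

# Toy reader outcomes for 28 readers (hypothetical subject IDs).
detections = {
    "sub-A": [True] * 20 + [False] * 8,
    "sub-B": [True] * 2 + [False] * 26,
}
difficult = hard_cases(detections)
```

In this toy case the two boxes overlap in 8 of their 64 voxels each, so the coefficient is 0.125, and only `sub-B` falls under the at-most-3-readers criterion.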

Results: Expert readers achieved an average detection rate of 68%, compared to 45% for non-experts and 27% for laypersons. Neuroradiologists detected the highest percentage of lesions (64%), while psychiatrists detected the least (34%). Neurosurgeons had the highest ROI sensitivity (0.70), and psychiatrists had the highest ROI precision (0.78). The benchmark test set revealed an expert detection rate of 49%.
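The abstract does not define ROI sensitivity and ROI precision. One plausible reading, sketched below as an assumption, treats them as voxel-wise fractions: sensitivity is the share of lesion voxels covered by the ROI, and precision is the share of ROI voxels that lie inside the lesion. The function name and the toy voxel sets are hypothetical.

```python
def roi_sensitivity_precision(roi, lesion):
    """Voxel-wise sensitivity (TPR) and precision (PPV) of an ROI against a lesion mask.

    Both arguments are sets of voxel coordinates. This definition is an
    assumption of the sketch, not the paper's stated formula.
    """
    overlap = len(roi & lesion)
    sensitivity = overlap / len(lesion) if lesion else 0.0
    precision = overlap / len(roi) if roi else 0.0
    return sensitivity, precision


# Toy data (hypothetical): a 4x4 lesion in one slice and a tight ROI covering half of it.
lesion_mask = {(x, y, 0) for x in range(4) for y in range(4)}
tight_roi = {(x, y, 0) for x in range(2, 4) for y in range(4)}
sens, prec = roi_sensitivity_precision(tight_roi, lesion_mask)
```

Here the tight ROI lies entirely inside the lesion but covers only half of it, giving a precision of 1.0 and a sensitivity of 0.5. A larger box would raise sensitivity at the cost of precision, which is the kind of tradeoff the reported group differences reflect.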

Significance: Reporting human performance in FCD detection provides a critical baseline for assessing the effectiveness of automated detection methods in a clinically relevant context. The defined benchmark test set serves as a useful indicator for evaluating advancements in computer-aided FCD detection approaches.

Plain language summary: Focal cortical dysplasias (FCDs) are malformations of cortical development and one of the most common causes of drug-resistant focal epilepsy. Once found, FCDs can be neurosurgically resected, which leads to seizure freedom in many cases. However, FCDs are difficult to detect in the visual assessment of magnetic resonance imaging. A myriad of algorithms for automated FCD detection have been developed, but their true clinical value remains unclear since there is no benchmark dataset for evaluation and comparison to human performance. Here, we use human FCD detection performance to define a benchmark dataset with which new methods for automated detection can be evaluated.

Keywords: artificial intelligence; computer‐aided detection; human performance; reader study.

Conflict of interest statement

AR has received fees as a speaker from UCB Pharma and travel support from the Elisabeth und Helmut Uhl Stiftung. UA has received fees as a speaker for Siemens Healthineers and as a clinical consultant for Bayer. AR lectures for Guerbet and Bayer and is part of the Advisory Board for GE, Bracco, and Guerbet. RS has received personal fees as a speaker or for serving on advisory boards from Angelini, Arvelle, Bial, Desitin, Eisai, Jazz Pharmaceuticals Germany GmbH, Janssen‐Cilag GmbH, LivaNova, LivAssured BV, Novartis, Precisis GmbH, Rapport Therapeutics, Tabuk Pharmaceuticals, UCB Pharma, UNEEG, and Zogenix. TR has received fees as a speaker from Eisai. None of the previously mentioned activities were related to the content of this manuscript. The remaining authors have nothing to declare. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.

Figures

**FIGURE 1**
Rating workflow. Raters were guided through the rating process by purpose-built software based on an open‐source MRI viewer. After viewing the MRI data in their native diagnostic environment, they first had to choose between FCD and Healthy Control (I)*. If they correctly chose FCD, they proceeded to localize the suspected lesion via a coordinate (red cross) and a rectangular bounding box defining the ROI (yellow) (II). Afterward, raters were shown clinical information and could revise their detection (III). Finally, steps (II) and (III) were repeated for all misclassified FCD cases (IV). *In the original publication, the cohort included healthy controls, whereas the subset analyzed here does not.

**FIGURE 2**
Tradeoff between TPR and PPV for pinpointed lesions across different rater specializations. Shaded area represents the 95% confidence interval.

**FIGURE 3**
Two examples of rater predictions. One has a detection rate of 96% (A) and the other of 14% (B). The ground-truth lesion mask is denoted by the green rectangle (top); rater bounding boxes/ROI predictions are shown relative to the lesion in light blue (middle) and across the whole brain in various colors (bottom). The annotations may appear non‐rectangular due to registration to MNI space for the purpose of visualization.
