Commun Biol. 2025 Jul 10;8(1):1039.
doi: 10.1038/s42003-025-08464-3.

How EEG preprocessing shapes decoding performance

Roman Kessler et al. Commun Biol. 2025.

Abstract

Electroencephalography (EEG) preprocessing varies widely between studies, but its impact on classification performance remains poorly understood. To address this gap, we analyzed seven experiments with 40 participants drawn from the public ERP CORE dataset. We systematically varied key preprocessing steps, such as filtering, referencing, baseline interval, detrending, and multiple artifact correction steps, all of which were implemented in MNE-Python. We then performed trial-wise binary classification (i.e., decoding) using either a neural network (EEGNet) or time-resolved logistic regression. Our findings demonstrate that preprocessing choices influenced decoding performance considerably. All artifact correction steps reduced decoding performance across experiments and models, while higher high-pass filter cutoffs consistently increased decoding performance. For EEGNet, baseline correction further increased decoding performance; for time-resolved classifiers, linear detrending and lower low-pass filter cutoffs increased decoding performance. The influence of other preprocessing choices was specific to each experiment or event-related potential component. The current results underline the importance of carefully selecting preprocessing steps for EEG-based decoding. While uncorrected artifacts may increase decoding performance, this comes at the expense of interpretability and model validity, as the model may exploit structured noise rather than the neural signal.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. The multiverse of preprocessing choices.
By systematically varying each preprocessing step, the raw data were replicated 2592 times, and each copy was preprocessed along a unique forking path. The asterisks indicate the processing choices of an example forking path used for some analyses, in which the N170 experiment was re-referenced to the average and all other experiments were re-referenced to P9/P10.
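
The forking-path idea can be sketched as a full cross of all preprocessing choices. The snippet below is a minimal illustration, not the authors' code: the step names follow the abbreviations used in the captions, but the specific levels per step (for example, the filter cutoffs and baseline lengths) are assumptions chosen only so that the product of the level counts matches the 2592 paths stated above.

```python
from itertools import product

# Hypothetical levels per preprocessing step. The source states only the
# total number of forking paths (2592); the concrete levels are assumed.
MULTIVERSE = {
    "ocular": ["none", "ica"],
    "muscle": ["none", "ica"],
    "low_pass_hz": [None, 6, 20, 45],
    "high_pass_hz": [None, 0.1, 0.5],
    "reference": ["average", "Cz", "P9P10"],
    "detrend": ["none", "linear"],
    "baseline_ms": [None, 200, 400],
    "autoreject": ["none", "interp", "reject"],
}


def forking_paths(grid):
    """Return one dict per unique combination of preprocessing choices."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*grid.values())]


paths = forking_paths(MULTIVERSE)
print(len(paths))  # 2592
```

Each dictionary in `paths` would then parameterize one preprocessing run of the same raw recording.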
Fig. 2. Grand average evoked responses for each experiment.
The data from one example forking path (Fig. 1) were used. The dashed and dotted time courses represent the average responses to each stimulus category. The solid time course represents the difference between the respective categories, illustrating the respective ERP. Time series originate either from a single channel, or from two channels, in which case the mean was calculated. The channel positions are indicated by small graphical legends on the right of each plot. Dotted vertical lines indicate response onset (ERN, LRP), and dashed vertical lines indicate stimulus onset (other experiments). Note that the y-axes are scaled differently for each experiment.
Fig. 3. Overview of decoding performances.
A EEGNet: (Balanced) Decoding accuracies (y-axis) are plotted for each forking path, averaged across participants, separately for each experiment (x-axis). B Time-resolved: T-sums (y-axis) are plotted for each forking path and across participants, separately for each experiment (x-axis). Triangles indicate the forking path without preprocessing. Boxes represent the interquartile range (25th to 75th percentile), with the median indicated by a solid black line. Whiskers extend to the most extreme values within 1.5 times the interquartile range from the lower and upper quartiles.
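
The box and whisker convention described in this caption can be written out as a short sketch. The function name `box_stats` and the toy values are hypothetical, and the quartile interpolation rule (Python's "inclusive" method) is an assumption, since the caption does not state which convention was used.

```python
from statistics import quantiles


def box_stats(values, whisker=1.5):
    """Quartiles, whisker ends, and outliers for one box in a box plot."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme data points that lie within the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Toy accuracies in percent (invented for illustration, not study results).
stats = box_stats([50, 52, 54, 56, 58, 90])
print(stats["whisker_high"], stats["outliers"])  # 58 [90]
```

In this toy example the value 90 lies beyond the upper fence (57.5 + 1.5 × 5 = 65), so the upper whisker stops at 58.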
Fig. 4. Time-resolved decoding accuracy for one forking path.
(Balanced) Decoding accuracies are illustrated on the y-axis for each time point (x-axis). The horizontal black line represents the chance level. One example forking path was used per experiment (Fig. 1). Decoding was performed within each participant, but the decoding time series were averaged across participants. Different experiments are separated in vertical panels. Dotted vertical lines indicate response onset (ERN, LRP), and dashed vertical lines indicate stimulus onset (remaining experiments). Permutation cluster mass tests with a family-wise error rate correction at α = 0.05 were performed for each experiment (shaded areas).
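
Balanced accuracy is the mean of the per-class recalls, which puts the chance level of a binary problem at 0.5 even when the two stimulus categories have unequal trial counts. The sketch below shows that computation applied per time point; the helper names and the toy labels are hypothetical, and it assumes that classifier predictions are already available for every time point.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def time_resolved_scores(y_true, preds_by_time):
    """One balanced accuracy per time point (one prediction list each)."""
    return [balanced_accuracy(y_true, preds) for preds in preds_by_time]


# Toy example with imbalanced classes: four trials of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
preds_by_time = [
    [0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [0, 0, 0, 0, 1, 1],  # perfect decoding
]
scores = time_resolved_scores(y_true, preds_by_time)
print(scores)  # [0.5, 1.0]
```

A classifier that always predicts the majority class reaches a plain accuracy of 4/6 here but a balanced accuracy of only 0.5, which is why the chance line sits at 0.5.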
Fig. 5. Influence of preprocessing steps on decoding performance.
Percentage deviations from the marginal means of either decoding accuracy (EEGNet, A) or T-sum (time-resolved, B) are depicted within each tile. Marginal means for each level (x-axis) of each preprocessing step (horizontal panels) are normalized to the mean of the respective experiment (vertical panels). Each tile therefore shows the percentage difference relative to this mean value. Only steps with a significant F-test (p < 0.05) are colored. Color scales differ between A and B. Ocular: ocular artifact correction; muscle: muscle artifact correction; ICA: independent component analysis; low-pass: low-pass filter in Hertz; high-pass: high-pass filter in Hertz; baseline: baseline interval in milliseconds; autoreject: version that either interpolates (interp) or rejects (reject) artifact-contaminated trials.
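
The normalization behind each tile can be illustrated with a small sketch. It assumes a fully crossed design, in which the marginal mean of a level equals the plain average over all forking paths that share that level; the source may instead have estimated marginal means from a fitted model. The function name, field names, and scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def percent_deviation(rows, step, score="score"):
    """Percent deviation of each level's mean score from the experiment mean."""
    grand = mean(r[score] for r in rows)
    by_level = defaultdict(list)
    for r in rows:
        by_level[r[step]].append(r[score])
    return {
        level: 100 * (mean(vals) - grand) / grand
        for level, vals in by_level.items()
    }


# Toy scores for one experiment (invented for illustration, not study results).
rows = [
    {"high_pass": "none", "detrend": "none", "score": 0.5},
    {"high_pass": "none", "detrend": "linear", "score": 0.5},
    {"high_pass": "0.5", "detrend": "none", "score": 0.75},
    {"high_pass": "0.5", "detrend": "linear", "score": 0.75},
]
deviations = percent_deviation(rows, "high_pass")
print(deviations)  # {'none': -20.0, '0.5': 20.0}
```

In this toy case the experiment mean is 0.625, so the two high-pass levels deviate by -20% and +20%, while the detrending levels would both deviate by 0%.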
Fig. 6. Interactions between preprocessing steps on decoding performance for the N170 experiment and EEGNet decoding.
Horizontal and vertical panels illustrate the different preprocessing steps. The individual preprocessing choices are illustrated on the x-axes and color-coded. The color legends on the diagonal refer to each horizontal panel. (Balanced) Decoding accuracies are shown on the y-axes. Stars indicate significance (* p < 0.05; ** p < 0.01; *** p < 0.001). Ocular: ocular artifact correction; muscle: muscle artifact correction; ICA: independent component analysis; low-pass: low-pass filter in Hertz; high-pass: high-pass filter in Hertz; baseline: baseline interval in milliseconds; autoreject: version that either interpolates (interp) or rejects (reject) artifact-contaminated trials.
