Front Neurosci. 2018 Feb 27:12:97.
doi: 10.3389/fnins.2018.00097. eCollection 2018.

The Harvard Automated Processing Pipeline for Electroencephalography (HAPPE): Standardized Processing Software for Developmental and High-Artifact Data

Laurel J Gabard-Durnam et al. Front Neurosci. 2018.

Abstract

Electroencephalography (EEG) recordings collected with developmental populations present particular challenges from a data processing perspective. These EEGs have a high degree of artifact contamination and often short recording lengths. As both sample sizes and EEG channel densities increase, traditional processing approaches like manual data rejection are becoming unsustainable. Moreover, such subjective approaches preclude standardized metrics of data quality, despite the heightened importance of such measures for EEGs with high rates of initial artifact contamination. There is presently a paucity of automated resources for processing these EEG data and no consistent reporting of data quality measures. To address these challenges, we propose the Harvard Automated Processing Pipeline for EEG (HAPPE) as a standardized, automated pipeline compatible with EEG recordings of variable lengths and artifact contamination levels, including high-artifact and short EEG recordings from young children or those with neurodevelopmental disorders. HAPPE processes event-related and resting-state EEG data from raw files through a series of filtering, artifact rejection, and re-referencing steps to processed EEG suitable for time-frequency-domain analyses. HAPPE also includes a post-processing report of data quality metrics to facilitate the evaluation and reporting of data quality in a standardized manner. Here, we describe each processing step in HAPPE, perform an example analysis with EEG files we have made freely available, and show that HAPPE outperforms seven alternative, widely used processing approaches. HAPPE removes more artifact than all alternative approaches while simultaneously preserving greater or equivalent amounts of EEG signal in almost all instances. We also provide distributions of HAPPE's data quality metrics in an 867-file dataset as a reference distribution and in support of HAPPE's performance across EEG data with variable artifact contamination and recording lengths. HAPPE software is freely available under the terms of the GNU General Public License at https://github.com/lcnhappe/happe.
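
As an illustration of the re-referencing step named in the abstract, the following is a minimal sketch of common-average re-referencing, the variant mentioned in the Figure 8 caption. It is not HAPPE's own code: the function name `average_rereference` and the list-of-lists data layout are assumptions made for this example, which uses only the Python standard library.

```python
from statistics import fmean


def average_rereference(eeg):
    """Re-reference EEG to the common average (illustrative helper, not HAPPE code).

    eeg: list of channels, each a list of samples in microvolts, all equal length.
    Returns new channels with the across-channel mean subtracted at every sample.
    """
    if not eeg:
        return []
    n_samples = len(eeg[0])
    if any(len(channel) != n_samples for channel in eeg):
        raise ValueError("all channels must have the same number of samples")
    # Mean across channels at each time point.
    average = [fmean(channel[t] for channel in eeg) for t in range(n_samples)]
    # Subtract that mean from every channel.
    return [[channel[t] - average[t] for t in range(n_samples)] for channel in eeg]


# Toy data: 3 channels x 3 samples (made-up values).
demo = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 5.0, 8.0]]
reref = average_rereference(demo)
```

After this operation the channels sum to zero at every sample, which is a quick sanity check on any average-referenced recording.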

Keywords: EEG; EEG processing; artifact removal; automated; data quality; development; electroencephalography; pipeline.

Figures

Figure 1
Schematic illustrating the HAPPE pipeline's processing steps. The intermediate output EEG files are indicated by the suffix added after that specific processing step in the blue boxes. The user options for segmentation steps and visualizing several steps in HAPPE with the semi-automated setting are also indicated. Independent component analysis is abbreviated to ICA.
Figure 2
Percent good channels retained. The distribution of the percent of channels retained as good channels during channel rejection is shown as a function of age when the EEG was acquired (A), or clinical group status (B) for a developmental sample.
Figure 3
Percent of independent components (ICs) rejected. The distribution of the percent of independent components rejected by MARA after ICA decomposition is shown as a function of age when the EEG was acquired (A), or clinical group status (B) for a developmental sample.
Figure 4
Relation between independent component (IC) rejection and the percent of data variance retained. The relation between the percent of variance in the EEG retained after MARA rejection of ICs (x-axis) and the percent of ICs rejected by MARA (y-axis) is shown for a developmental sample. The distributions for each metric in the same sample are shown opposite the labeled axes; top distribution is for percent of variance retained, right distribution is for percent of ICs rejected.
Figure 5
Percent variance retained post-MARA rejection. The distribution of the percent of variance in the EEG signal retained after MARA rejection of independent components is shown as a function of age when the EEG was acquired (A), or clinical group status (B) for a developmental sample.
Figure 6
Median artifact probability of retained EEG. The distribution of the median artifact probability value for retained independent components post-MARA rejection is shown as a function of age when the EEG was acquired (A), or clinical group status (B) for a developmental sample.
Figure 7
Mean artifact probability of retained EEG. The distribution of the mean (average) artifact probability value for the retained independent components post-MARA rejection is shown as a function of age when the EEG was acquired (A), or clinical group status (B) for a developmental sample.
Figure 8
EEG signal before and after HAPPE processing. Three files from the example dataset are shown (A-C) with 14 s of data extracted from the first 30 s of the recording. The EEG signal after minimal processing (i.e., filtering, channel subset selection, and average re-referencing) is shown in the left panel. The EEG signal after HAPPE processing as described in the example analysis results section is shown in the right panel. All scales are in microvolts.
Figure 9
Results from HAPPE processing steps and comparison to alternative approaches. For each file in the example dataset, the EEG power (y-axis, in microvolts squared) across a range of EEG signal frequencies (x-axis) is shown as a function of several processing steps within HAPPE. Power spectra are generated after the filtering step (filter), after basic preprocessing (filter, CleanLine, bad channel rejection), after wavelet-enhanced independent component analysis (wavelet thresholding), after independent component analysis with MARA rejection (ICA with MARA rejection), after segment rejection for the retained data (segment rejection), and after the final channel interpolation and re-referencing steps (fully-processed) (A). All 8 approaches for artifact rejection are compared in terms of the percent EEG data variance retained (x-axis) and the average artifact level in the retained EEG data (y-axis), where optimal performance would place an approach near the bottom right corner of the chart, retaining most of the EEG variance with low levels of artifact (B).
Figure 10
Schematic illustrating the HAPPE pipeline in relation to seven alternative processing approaches. Processing steps that are consistent across approaches, and implemented in HAPPE, are highlighted in light green. Processing steps that are unique to the alternative approaches are highlighted in light blue. Independent component analysis is abbreviated to ICA. Multiple Artifact Rejection Algorithm is abbreviated to MARA. Wavelet-thresholded ICA is abbreviated to W-ICA. Artifact Subspace Reconstruction is abbreviated to ASR. Fully Automated Statistical Thresholding for EEG artifact Rejection is abbreviated to FASTER. Automatic EEG artifact Detector based on the Joint Use of Spatial and Temporal features is abbreviated to ADJUST. SemiAutomated Selection of Independent Components for Artifact Correction in the EEG is abbreviated to SASICA.
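
The data quality metrics named in the captions for Figures 2-7 can be sketched for a single file as follows. This is an illustration under stated assumptions, not HAPPE's implementation: the function `quality_report`, its arguments, the 0.5 rejection threshold, and the toy values are all hypothetical, and percent variance retained is approximated by treating each component's variance contribution as additive, which is a simplification.

```python
from statistics import fmean, median


def quality_report(n_total, n_good, artifact_probs, ic_variances, threshold=0.5):
    """Summarize per-file quality metrics (illustrative helper, not HAPPE code).

    n_total, n_good: channel counts before and after bad-channel rejection.
    artifact_probs:  per-component artifact probability in [0, 1] (assumed classifier output).
    ic_variances:    per-component variance contribution (assumed additive; a simplification).
    threshold:       components with probability above this are rejected (assumed value).
    """
    if not artifact_probs or len(artifact_probs) != len(ic_variances):
        raise ValueError("need one variance value per component")
    total_variance = sum(ic_variances)
    if n_total <= 0 or total_variance <= 0:
        raise ValueError("channel count and total variance must be positive")
    keep = [p <= threshold for p in artifact_probs]
    retained_probs = [p for p, k in zip(artifact_probs, keep) if k]
    if not retained_probs:
        raise ValueError("all components were rejected")
    retained_variance = sum(v for v, k in zip(ic_variances, keep) if k)
    return {
        "percent_good_channels": 100.0 * n_good / n_total,
        "percent_ics_rejected": 100.0 * keep.count(False) / len(keep),
        "percent_variance_retained": 100.0 * retained_variance / total_variance,
        "median_artifact_prob_retained": median(retained_probs),
        "mean_artifact_prob_retained": fmean(retained_probs),
    }


# Toy values for one hypothetical file: 4 components, 96 of 128 channels kept.
report = quality_report(
    n_total=128,
    n_good=96,
    artifact_probs=[0.1, 0.9, 0.3, 0.7],
    ic_variances=[4.0, 3.0, 2.0, 1.0],
)
```

Collecting one such dictionary per file gives the kind of table from which the distributions in Figures 2-7 could be plotted by age or clinical group.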
