Neuroimage. 2018 Jul 15;175:176-187.
doi: 10.1016/j.neuroimage.2018.03.016. Epub 2018 Mar 9.

Applying dimension reduction to EEG data by Principal Component Analysis reduces the quality of its subsequent Independent Component decomposition

Fiorenzo Artoni et al. Neuroimage.

Abstract

Independent Component Analysis (ICA) has proven to be an effective data driven method for analyzing EEG data, separating signals from temporally and functionally independent brain and non-brain source processes and thereby increasing their definition. Dimension reduction by Principal Component Analysis (PCA) has often been recommended before ICA decomposition of EEG data, to minimize both the amount of required data and the computation time. Here we compared ICA decompositions of fourteen 72-channel single subject EEG data sets obtained (i) after applying preliminary dimension reduction by PCA, (ii) after applying no such dimension reduction, or else (iii) applying PCA only. Reducing the data rank by PCA (even to remove only 1% of data variance) adversely affected both the numbers of dipolar independent components (ICs) and their stability under repeated decomposition. For example, decomposing a principal subspace retaining 95% of original data variance reduced the mean number of recovered 'dipolar' ICs from 30 to 10 per data set and reduced median IC stability from 90% to 76%. PCA rank reduction also decreased the numbers of near-equivalent ICs across subjects. For instance, decomposing a principal subspace retaining 95% of data variance reduced the number of subjects represented in an IC cluster accounting for frontal midline theta activity from 11 to 5. PCA rank reduction also increased uncertainty in the equivalent dipole positions and spectra of the IC brain effective sources. These results suggest that when applying ICA decomposition to EEG data, PCA rank reduction should best be avoided.

Keywords: Dipolarity; Electroencephalogram, EEG; Independent component analysis, ICA; Principal component analysis, PCA; Reliability; Source localization.
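
The rank reduction step described in the abstract keeps the smallest set of leading principal components that still accounts for a chosen share of the data variance (85%, 95%, or 99% in the conditions compared). A minimal sketch of that selection rule follows. It assumes the per-component variances (the covariance eigenvalues) have already been computed; the function name `n_components_for_variance` and the example variances are hypothetical illustrations, not the authors' code or data.

```python
def n_components_for_variance(pc_variances, fraction):
    """Return the smallest number of leading PCs whose summed variance
    reaches `fraction` of the total variance.

    pc_variances: iterable of non-negative per-component variances
                  (assumed input; e.g. covariance eigenvalues).
    fraction:     share of total variance to retain, in (0, 1].
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # PCs are ranked by variance, largest first.
    ordered = sorted(pc_variances, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")

    cumulative = 0.0
    for k, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= fraction:
            return k
    # Reached only if rounding keeps the ratio just below `fraction`.
    return len(ordered)


if __name__ == "__main__":
    # Made-up variances for illustration only (not taken from the paper).
    example = [60, 20, 10, 6, 3.5, 0.5]
    for share in (0.85, 0.95, 0.99):
        print(share, n_components_for_variance(example, share))
```

With real EEG data the variance is typically concentrated in a few components, which is why a high retained-variance threshold can still discard most of the data rank.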

Figures

Figure 1:
Mean explained variance (blue line) in relation to the number of largest principal components (PCs) retained, including (A) or not including (B) the bipolar vertical and horizontal electro-oculographic channels (EOGv and EOGh). Panel C shows the average number of PCs necessary to explain at least 85%, 95%, 99% of original dataset variance, including (green) or not including (blue) the EOG.
Figure 2:
For a representative subject, scalp maps of quasi-dipolar components (dipolarity above 85%) extracted by applying ICA (ICA-Only) or PCA (PCA-Only) directly to the data, or by performing ICA after reducing the original data rank by PCA so as to retain at least 85% (PCA85ICA, 4 ± 0.5 Median ± MAD PCs), 95% (PCA95ICA, 8 ± 2.5 PCs) and 99% (PCA99ICA, 21 ± 6 PCs) of data variance respectively. Components are sorted into identifiable non-brain Artifact and Brain ICs, separated by the vertical red dashed line. A dashed blue box highlights eye activity-related artifact ICs (vertical EOG and horizontal EOG ICs, respectively) in the PCA95ICA, PCA99ICA, and ICA-Only conditions.
Figure 3:
Panels A and B: box plots of median numbers of ICs (#ICs) with dipolarity values (A) above 85% (quasi-dipolar) and (B) above 95% (near-dipolar). Significance of differences between conditions was determined using Kruskal-Wallis tests followed by Tukey post hoc tests. Panel C: Estimated probabilities of significant condition differences in the number of quasi-dipolar components (RV > 85%) for the following comparisons: (i) PCA-Only versus PCA85ICA; (ii) PCA85ICA versus PCA95ICA; (iii) PCA95ICA versus PCA99ICA; (iv) PCA99ICA versus ICA-Only. Each panel shows p-values for existence of significant differences between the number of quasi-dipolar components in the contrasted condition pair for each dipolarity threshold (x axis, RV > 80% to RV > 99%). Dashed red lines show the condition-difference significance threshold (p = 0.05). Panel D: Numbers of dipolar ICs (y axis) available after PCA dimensionality reduction for two dipolarity thresholds (dipolarity > 85%, > 95%) in decomposition conditions PCA85ICA (black dots), PCA95ICA (green dots), PCA99ICA (blue dots), and ICA-Only (red dots). A dashed blue line connects the dots for each subject. A red dashed line plots the #ICs (the upper bound to the #dipolar ICs).
Figure 4:
Histograms of component dipolarities (across all 14 data sets) following preliminary PCA subspace restriction (to RV>85%, RV>95%, or RV>99%), without preliminary PCA (ICA-Only), or directly applying PCA (PCA-Only). The median of each distribution is indicated by a red vertical line (sk = skewness). Note the different y-axis scales.
Figure 5:
IC clusters extracted by RELICA bootstrap decompositions for one subject, either following reduction of data rank to a principal subspace (PCA85ICA, PCA95ICA and PCA99ICA) or (lower right) without PCA-based rank reduction. Within each box, the ICs are clustered according to mutual similarity and cluster quality index (QIc) values are computed to measure their compactness. At far left and right, scalp maps of example components in clusters associated with left hand-area (8–12 Hz) mu rhythm activity, central posterior (8–12 Hz) alpha band activity, and eye blink artifact are shown and their QIc values are indicated. Note the stronger between-subject cluster definition and higher QIc values (reflecting more highly correlated time course) for the IC clusters without PCA processing (ICA-Only, lower right).
Figure 6:
Distribution of IC QIc values across the subjects for different levels of principal subspace data variance retained (PCA85ICA, PCA95ICA, PCA99ICA) and for ICA-Only (100%). The median of each distribution is indicated by a red vertical line (med = median; sk = skewness). Bottom panel: Significance of pairwise differences between conditions, determined using a Kruskal-Wallis test with Tukey post hoc correction for multiple comparisons (*** = p < .001).
Figure 7:
The frontal midline theta (fMθ) cluster identified across subjects in each of the four decomposition conditions (PCA85ICA, PCA95ICA, PCA99ICA and ICA-Only). The picture shows the individual IC scalp maps (1st column), the cluster-mean maps (2nd column), and the IC equivalent dipole locations (3rd column – each dot represents one IC for one subject). The median absolute deviations (MAD; σx, σy, σz in mm) of the cluster IC equivalent dipole positions are given. The 4th column shows cluster median power spectral densities (PSDs, with ± MAD shaded). σθ, the MAD of the PSD in the (4–8 Hz) theta band, is also indicated.
Figure 8:
Left mu clusters across all subjects for the PCA85ICA, PCA95ICA, PCA99ICA and ICA-Only decomposition pipelines. The picture shows the individual IC scalp maps (1st column), cluster mean scalp map (2nd column), IC equivalent dipole locations (3rd column – each dot represents an IC of one subject), and in the 4th column, the cluster median (± 9–11 Hz MAD) PSD. This is another example of the effects of PCA dimension reduction at the across-subjects cluster level (cf. Figure 7).
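
Several captions above report spread as median ± MAD (median absolute deviation). As a reminder of how that statistic is formed, here is a minimal sketch using only the Python standard library. It assumes the plain, unscaled MAD (the median of absolute deviations from the median); the captions do not say whether a scaling factor was applied, and the helper name `median_and_mad` is hypothetical.

```python
from statistics import median


def median_and_mad(values):
    """Return (median, MAD) for a non-empty sequence of numbers.

    MAD here is the unscaled median absolute deviation:
    median(|x - median(x)|). This is an assumption; some analyses
    multiply it by a constant to estimate a standard deviation.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return center, mad


if __name__ == "__main__":
    # Made-up numbers for illustration only (not taken from the paper).
    print(median_and_mad([1, 2, 3, 4, 100]))
```

Because both steps use medians, a single extreme value (such as 100 in the example) has little influence on the result, which suits small per-subject samples like the fourteen data sets described here.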
