IEEE Trans Biomed Eng. 2021 Nov;68(11):3205-3216.
doi: 10.1109/TBME.2021.3062502. Epub 2021 Oct 19.

Evaluating Performance of EEG Data-Driven Machine Learning for Traumatic Brain Injury Classification

Nicolas Vivaldi et al. IEEE Trans Biomed Eng. 2021 Nov.

Abstract

Objectives: Big data analytics can potentially benefit the assessment and management of complex neurological conditions by extracting information that is difficult to identify manually. In this study, we evaluated the performance of commonly used supervised machine learning algorithms in distinguishing patients with a history of traumatic brain injury (TBI) from those with a history of stroke and/or a normal EEG.

Methods: Support vector machine (SVM) and K-nearest neighbors (KNN) models were generated with a diverse feature set from the Temple EEG Corpus, both for two-class classification of patients with TBI history versus normal subjects and for three-class classification of TBI, stroke, and normal subjects.

Results: For two-class classification, an accuracy of 0.94 was achieved in 10-fold cross validation (CV), and 0.76 in independent validation (IV). For three-class classification, accuracies of 0.85 and 0.71 were reached in CV and IV respectively. Overall, linear discriminant analysis (LDA) feature selection and SVM models consistently performed well in both CV and IV and for both two-class and three-class classification. Compared to normal controls, both TBI and stroke patients showed an overall reduction in coherence and relative PSD in the delta frequency band, and an increase in higher-frequency (alpha, mu, beta and gamma) power. However, stroke patients showed a greater degree of change and an additional global decrease in theta power.
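The cross-validation (CV) and independent-validation (IV) protocol reported above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses a hand-written KNN classifier on synthetic two-class data, and the names `knn_predict`, `k_fold_accuracy`, and `make_toy_data`, the choice k=3, and the Gaussian toy features are all assumptions made for the example.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def k_fold_accuracy(X, y, n_folds=10, k=3, seed=0):
    """Mean accuracy over n_folds folds of a shuffled dataset."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        scores.append(
            accuracy(
                [X[i] for i in train], [y[i] for i in train],
                [X[i] for i in fold], [y[i] for i in fold], k,
            )
        )
    return sum(scores) / n_folds


def make_toy_data(n_per_class, shift, seed):
    """Hypothetical stand-in for EEG features: two Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("normal", 0.0), ("tbi", shift)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(4)])
            y.append(label)
    return X, y


# Development set for CV; a separately generated set plays the independent data.
X_dev, y_dev = make_toy_data(50, shift=3.0, seed=1)
X_ind, y_ind = make_toy_data(20, shift=3.0, seed=2)

cv_acc = k_fold_accuracy(X_dev, y_dev)          # 10-fold CV on development data
iv_acc = accuracy(X_dev, y_dev, X_ind, y_ind)   # train on all dev data, test on IV
print(f"CV accuracy: {cv_acc:.2f}, IV accuracy: {iv_acc:.2f}")
```

Because the toy independent set is drawn from the same distribution as the development set, it will not reproduce the gap between CV and IV accuracy reported in the abstract; with real recordings, the independent set comes from subjects never seen during model development.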

Conclusions: Our study suggests that EEG data-driven machine learning can be a useful tool for TBI classification.

Significance: Our study provides preliminary evidence that EEG ML algorithms can potentially provide the specificity needed to separate different neurological conditions.

Figures

Fig. 1.
Flowchart describing data processing and model training. (ICA: independent component analysis, PSD: power spectral density, SE: spectral entropy, PAC: phase-amplitude coupling, LDA: linear discriminant analysis, PCA: principal component analysis, FSFS: forward sequential feature selection, BSFS: backwards sequential feature selection).
Fig. 2.
Performance of models trained with features selected by linear discriminant analysis (LDA). The figure shows the distribution of accuracies over 1000 iterations of training using randomly labeled data (orange), true labeled data (blue), and the independent dataset (green line) for models based on features selected by LDA. Black and green dotted lines show the ZeroR benchmarks for cross-validation (CV) and independent validation (IV) respectively. In 10-fold CV, all models trained with true labeled data performed significantly better than those trained with randomly labeled data at the 10^-10 significance level (two-sample K-S test), except for the SVM fine Gaussian models. (SVM: support vector machine, KNN: K-nearest neighbors).
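The two reference points used in this caption, the ZeroR benchmark and the two-sample K-S comparison of accuracy distributions, can be illustrated with a short sketch. The function names `zero_r_accuracy` and `ks_statistic` and all numbers below are hypothetical; the sketch computes only the K-S statistic (the largest gap between the two empirical distribution functions), not a p-value, and uses five accuracies per group where the figure uses 1000 iterations.

```python
import bisect
from collections import Counter


def zero_r_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return sum(label == majority for label in test_labels) / len(test_labels)


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: maximum distance between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up class counts: the majority class in training is "normal".
train_labels = ["tbi"] * 30 + ["normal"] * 70
test_labels = ["tbi"] * 10 + ["normal"] * 10
baseline = zero_r_accuracy(train_labels, test_labels)

# Made-up accuracies from repeated training with true vs. shuffled labels.
true_label_acc = [0.90, 0.92, 0.94, 0.91, 0.93]
random_label_acc = [0.48, 0.52, 0.50, 0.55, 0.47]
d_stat = ks_statistic(true_label_acc, random_label_acc)

print(f"ZeroR baseline: {baseline:.2f}, K-S statistic: {d_stat:.2f}")
```

A K-S statistic near 1 means the two accuracy distributions barely overlap, which is the situation the caption describes for most models; a model whose true-label accuracies sit on top of the shuffled-label accuracies would give a statistic near 0.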
Fig. 3.
Comparison of model performance for classifying patients with TBI history from normal subjects. (a) and (b) show boxplots of the accuracy of models trained with features selected by different methods, evaluated with 10-fold CV and the independent dataset respectively. Models trained with PCA-selected features performed worst in both 10-fold CV and independent validation, while those trained with features selected by LDA performed best. Models trained with features selected by statistics performed worse than those with LDA in CV; however, their performance was comparable to that of LDA models when classifying the independent dataset. (c) and (d) compare the accuracy of models trained with input features that include demographic information against those without it (Demo: demographic). The majority of models with demographic inputs appear to perform better than their counterparts. (e) and (f) compare the performance of models trained with features generated from artifact-removed clean EEG data versus those from raw EEG. Though variability is present, most models trained with raw data performed slightly better than the corresponding clean-data models in CV, and the performance of raw-data models was comparable to that of clean-data models when predicting the independent dataset. Each line in (c) to (f) represents one algorithm. Dark lines in (c) and (e) indicate a significant difference in the two-sample K-S test at the 10^-10 significance level. (∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001; one-way ANOVA and post-hoc Tukey test in (a) and (b), signed-rank test in (c) to (f).) (CV: cross-validation, Stats: statistics, LDA: linear discriminant analysis, FSFS: forward sequential feature selection, PCA: principal component analysis, BSFS: backwards sequential feature selection).
Fig. 4.
Comparison of model performance for 3-class classification. (a) and (b) show boxplots of the accuracy of models trained with features selected by different methods for classifying patients with TBI history, patients with stroke history, and normal subjects, evaluated with 10-fold CV and the independent dataset respectively. Models trained with BSFS- and LDA-selected features performed best in 10-fold CV. In IV, models trained with features selected by LDA, statistics, and FSFS showed comparable performance. (c) and (d) compare the accuracy of models trained with input features that include demographic information against those without it (Demo: demographic). The majority of models with demographic inputs appear to perform better than their counterparts. (e) and (f) compare the performance of models trained with features generated from artifact-removed clean EEG data versus those from raw EEG. The majority of models built on clean data performed significantly better than those built on raw data. (∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001; one-way ANOVA and post-hoc Tukey test in (a) and (b).) (CV: cross-validation, Stats: statistics, LDA: linear discriminant analysis, FSFS: forward sequential feature selection, PCA: principal component analysis, BSFS: backwards sequential feature selection).
Fig. 5.
Relationship between IV and CV accuracies for two-class (a) and three-class (b) classification. The respective ZeroR benchmarks for CV and IV are shown as black lines. (CV: cross-validation, IV: independent validation, LDA: linear discriminant analysis, FSFS: forward sequential feature selection, PCA: principal component analysis, Stats: statistics, BSFS: backwards sequential feature selection).
Fig. 6.
Changes in clean EEG features. (a) shows the fraction of features selected by statistics, LDA, and FSFS out of the total number of features of each type (i.e., 171 coherence and 19 relative PSD features in each frequency band), without consideration of channels, for 2-class classification. (b) shows the fraction of selected features for 3-class classification. (c) shows the median z score for each feature type in normal, TBI and stroke subjects respectively. (d) shows the broadband coherence change from normal to TBI. The main panel shows the median z score of the coherence coefficients of all channel pairs. The inset shows the channel pairs with a median z score higher than 0.5 or lower than −0.5. (e) shows the topographic map of relative PSD based on z scores. (f) shows the z score of broadband coherence in stroke subjects relative to TBI. The inset shows the channel pairs with a median z score higher than 0.5 or lower than −0.5. (g) shows the topographic map of the relative PSD z score of stroke subjects relative to TBI. (LDA: linear discriminant analysis, FSFS: forward sequential feature selection, Stats: statistics, PAC: phase-amplitude coupling, PSD: power spectral density).
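The median z score and the ±0.5 threshold mentioned in this caption can be illustrated with a small sketch. The caption does not state how the z scores were computed, so the sketch assumes each value is standardized against the mean and sample standard deviation of a reference group; the channel-pair names and coherence values are made up, and `median_z` is a hypothetical helper.

```python
from statistics import mean, median, stdev


def median_z(group_values, reference_values):
    """Median z score of a group, standardized against a reference group.

    Assumption: z = (value - reference mean) / reference sample SD.
    """
    mu = mean(reference_values)
    sd = stdev(reference_values)
    return median((v - mu) / sd for v in group_values)


# Made-up coherence values per channel pair: (normal subjects, TBI subjects).
coherence = {
    "Fp1-Fp2": ([0.30, 0.32, 0.28, 0.31, 0.29], [0.24, 0.26, 0.22, 0.27, 0.25]),
    "C3-C4": ([0.50, 0.52, 0.48, 0.51, 0.49], [0.50, 0.51, 0.49, 0.52, 0.50]),
}

# Median z score of the TBI group relative to normal, per channel pair.
median_scores = {
    pair: median_z(tbi, normal) for pair, (normal, tbi) in coherence.items()
}

# Keep only pairs whose median z score exceeds the +/-0.5 threshold.
flagged = [pair for pair, z in median_scores.items() if abs(z) > 0.5]

for pair, z in median_scores.items():
    print(f"{pair}: median z = {z:+.2f}")
print("Pairs beyond +/-0.5:", flagged)
```

In this toy input, the first pair shows a clear drop in the TBI group and is flagged, while the second pair is essentially unchanged and is not.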
