2019 Aug 8;9(1):11538.
doi: 10.1038/s41598-019-47795-0.

Comparison of Two-Talker Attention Decoding from EEG with Nonlinear Neural Networks and Linear Methods


Gregory Ciccarelli et al. Sci Rep. 2019.

Abstract

Auditory attention decoding (AAD) through a brain-computer interface has seen a flowering of developments since it was first introduced by Mesgarani and Chang (2012) using electrocorticography (ECoG) recordings. AAD has been pursued for its potential application to hearing-aid design, in which an attention-guided algorithm selects, from multiple competing acoustic sources, which should be enhanced for the listener and which should be suppressed. Traditionally, researchers have separated the AAD problem into two stages: reconstruction of a representation of the attended audio from neural signals, followed by determining the similarity between the candidate audio streams and the reconstruction. Here, we compare the traditional two-stage approach with a novel neural-network architecture that subsumes the explicit similarity step. We compare this new architecture against linear and non-linear (neural-network) baselines using both wet and dry electroencephalogram (EEG) systems. Our results indicate that the new architecture outperforms the baseline linear stimulus-reconstruction method, improving decoding accuracy from 66% to 81% using wet EEG and from 59% to 87% for dry EEG. Also of note was the finding that the dry EEG system can deliver comparable or even better results than the wet, despite the dry system having roughly one third as many EEG channels as the wet. The 11-subject, wet-electrode AAD dataset for two competing, co-located talkers, the 11-subject, dry-electrode AAD dataset, and our software are available for further validation, experimentation, and modification.
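The traditional two-stage pipeline described above can be sketched in a few lines of NumPy. This is an illustrative toy on synthetic data, not the authors' code: a least-squares backward model (TRF) reconstructs the attended envelope from time-lagged EEG, and the talker whose envelope correlates better with the reconstruction is selected. The sizes, lag count, and mixing model are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_channels, n_lags = 2000, 8, 16  # toy sizes (assumptions)

def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of each EEG channel into a design matrix."""
    cols = [np.roll(eeg, lag, axis=0) for lag in range(n_lags)]
    X = np.concatenate(cols, axis=1)
    X[:n_lags] = 0.0  # zero out rows contaminated by wrap-around
    return X

# Synthetic data: the attended envelope leaks linearly into the EEG.
env_a = rng.standard_normal(n_samples)   # attended talker envelope
env_b = rng.standard_normal(n_samples)   # ignored talker envelope
mixing = rng.standard_normal((1, n_channels))
eeg = env_a[:, None] @ mixing + 0.5 * rng.standard_normal((n_samples, n_channels))

# Stage 1: fit the linear backward model (TRF) by least squares.
X = lag_matrix(eeg, n_lags)
w, *_ = np.linalg.lstsq(X, env_a, rcond=None)
recon = X @ w  # reconstructed envelope

# Stage 2: correlate the reconstruction with each candidate audio stream.
def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

decision = "A" if corr(recon, env_a) > corr(recon, env_b) else "B"
```

In this toy setting the reconstruction correlates strongly with the envelope that was mixed into the EEG, so the correct talker is selected; the end-to-end DNN compared in the paper replaces the explicit correlation step of Stage 2.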


Conflict of interest statement

All authors are part of a provisional patent application on the end-to-end, deep neural network auditory attention decoding algorithm described in this work. NM and J O’S are inventors on submitted patent WO2017218492A1 which covers neural decoding of auditory attention.

Figures

Figure 1
System architecture for auditory attention decoding: backward model. The temporal response function (TRF) can be linear or non-linear (neural network, see Fig. 3).
Figure 2
System architecture for auditory attention decoding: DNN binary classification. See Fig. 4 for a specific instance of the DNN.
Figure 3
The neural-network architecture for stimulus reconstruction, based on the design in de Taillez et al. A single hidden layer with two nodes (FC1) enforces significant compression of the EEG data before it is transformed into a predicted audio stimulus (see Fig. 1 for the system architecture). BN = batch normalization, FC = fully connected.
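A forward pass through a reconstruction network of this shape can be sketched in NumPy. This is a hedged illustration of the bottleneck idea only: the feature count, lag structure, tanh activation, and weight scales are assumptions, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_features = 500, 64 * 16   # e.g. 64 channels x 16 lags (assumed)

def batch_norm(x, eps=1e-5):
    """Normalize each feature to zero mean / unit variance over the batch."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

x = rng.standard_normal((n_steps, n_features))   # lagged EEG features
W1 = rng.standard_normal((n_features, 2)) * 0.01  # FC1: two-node bottleneck
b1 = np.zeros(2)
W2 = rng.standard_normal((2, 1)) * 0.01           # output: predicted envelope
b2 = np.zeros(1)

h = np.tanh(batch_norm(x) @ W1 + b1)   # compressed 2-node representation
envelope_hat = (h @ W2 + b2).ravel()   # one predicted audio sample per step
```

The two-node hidden layer means that, whatever the input dimensionality, each time step is squeezed through just two numbers before the audio prediction is made.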
Figure 4
The convolutional architecture used for integrated similarity computation between EEG and a candidate audio stream. Components include batch normalization (BN), convolution layers (Conv1, Conv2), exponential linear units (ELU), drop-outs (DO), and fully connected layers (FC1–FC4). Wet EEG (kernel × input channels × output channels): Conv1: 3 × 65 × 64, Conv2: 1 × 64 × 2. Dry EEG: Conv1: 3 × 19 × 19, Conv2: 1 × 19 × 2. Both: FC1: 246 × 200, FC2: 200 × 200, FC3: 200 × 100, FC4: 100 × 1; MaxPool 1D, stride 2. See Fig. 2 for the system architecture.
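The layer sizes in the Fig. 4 caption can be sanity-checked with simple shape arithmetic. The sketch below assumes "valid" convolutions and a max-pool of kernel 2, stride 2, and assumes an input window of 248 samples, chosen so that the flattened size matches the stated FC1 input of 246; both are assumptions, not values from the paper.

```python
def conv1d_len(n, kernel, stride=1):
    """Output length of a 'valid' 1-D convolution or pooling layer."""
    return (n - kernel) // stride + 1

T = 248                                 # assumed samples per decision window
t = conv1d_len(T, kernel=3)             # Conv1, kernel 3
t = conv1d_len(t, kernel=2, stride=2)   # MaxPool 1D, stride 2
t = conv1d_len(t, kernel=1)             # Conv2, kernel 1 (length unchanged)
flat = 2 * t                            # Conv2 emits 2 channels, then flatten
# flat should equal the stated FC1 input size of 246 under these assumptions
```

Under these assumed conventions the flattened feature count comes out to 2 × 123 = 246, consistent with the FC1: 246 × 200 layer in the caption.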
Figure 5
Per-subject attention-decoding accuracy using a wet, 64-channel EEG system with a 10-second evaluation window, for three algorithms: linear stimulus reconstruction (LSQ), non-linear stimulus reconstruction (DNN Corr.), and DNN classification (DNN Clf.). Chance performance is indicated by the black stars.
Figure 6
Per-subject attention-decoding accuracy using a dry EEG system with a 10-second evaluation window, for three algorithms: linear stimulus reconstruction (LSQ), non-linear stimulus reconstruction (DNN Corr.), and DNN classification (DNN Clf.). Chance performance is indicated by the black stars.
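A per-subject chance level like the black stars in Figs 5 and 6 is commonly derived from the binomial distribution: the smallest accuracy a fair coin would be unlikely to reach over the subject's decision windows. The sketch below is a generic illustration of that idea; the window count (100) and significance level are assumptions, not the paper's values.

```python
from math import comb

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct guesses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def chance_threshold(n, alpha=0.05):
    """Smallest accuracy a guesser reaches with probability below alpha."""
    for k in range(n + 1):
        if binom_sf(k, n) < alpha:
            return k / n
    return 1.0

threshold = chance_threshold(100)  # just under 0.6 for 100 windows
```

Shorter evaluation windows yield more decisions per subject, which pulls the chance threshold closer to 50% but also makes each individual decision noisier.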
Figure 7
Normalized grand average headmaps of the LSQ TRF values across subjects.
Figure 8
Normalized grand average headmaps of the mean convolutional weights for the wet, wet sub-sampled, and dry EEG systems for the DNN classifier network. The colors are scaled between 0 and 1. The audio channel weights (not shown) were 1.7, 1.1, and 0.77 for the wet, wet sub-sampled, and dry systems, respectively.
