2019 Aug 8;9(1):11538.
doi: 10.1038/s41598-019-47795-0.

Comparison of Two-Talker Attention Decoding from EEG with Nonlinear Neural Networks and Linear Methods


Gregory Ciccarelli et al. Sci Rep. 2019.

Abstract

Auditory attention decoding (AAD) through a brain-computer interface has seen a flowering of developments since it was first introduced by Mesgarani and Chang (2012) using electrocorticography (ECoG) recordings. AAD has been pursued for its potential application to hearing-aid design, in which an attention-guided algorithm selects, from multiple competing acoustic sources, which should be enhanced for the listener and which should be suppressed. Traditionally, researchers have separated the AAD problem into two stages: reconstruction of a representation of the attended audio from neural signals, followed by determining the similarity between the candidate audio streams and the reconstruction. Here, we compare the traditional two-stage approach with a novel neural-network architecture that subsumes the explicit similarity step. We compare this new architecture against linear and non-linear (neural-network) baselines using both wet and dry electroencephalogram (EEG) systems. Our results indicate that the new architecture outperforms the baseline linear stimulus-reconstruction method, improving decoding accuracy from 66% to 81% using wet EEG and from 59% to 87% for dry EEG. Also of note was the finding that the dry EEG system can deliver comparable or even better results than the wet, despite the dry system having roughly one third as many EEG channels as the wet. The 11-subject, wet-electrode AAD dataset for two competing, co-located talkers, the 11-subject, dry-electrode AAD dataset, and our software are available for further validation, experimentation, and modification.
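The traditional two-stage pipeline described above can be sketched in a few lines of NumPy. This is an illustrative toy on synthetic data, not the authors' code: a least-squares backward model (TRF) reconstructs the attended envelope from time-lagged EEG, and the talker whose envelope correlates better with the reconstruction is selected. The sizes, lag count, and mixing model are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_channels, n_lags = 2000, 8, 16  # toy sizes (assumptions)

def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of each EEG channel into a design matrix."""
    cols = [np.roll(eeg, lag, axis=0) for lag in range(n_lags)]
    X = np.concatenate(cols, axis=1)
    X[:n_lags] = 0.0  # zero out rows contaminated by wrap-around
    return X

# Synthetic data: the attended envelope leaks linearly into the EEG.
env_a = rng.standard_normal(n_samples)   # attended talker envelope
env_b = rng.standard_normal(n_samples)   # ignored talker envelope
mixing = rng.standard_normal((1, n_channels))
eeg = env_a[:, None] @ mixing + 0.5 * rng.standard_normal((n_samples, n_channels))

# Stage 1: fit the linear backward model (TRF) by least squares.
X = lag_matrix(eeg, n_lags)
w, *_ = np.linalg.lstsq(X, env_a, rcond=None)
recon = X @ w  # reconstructed envelope

# Stage 2: correlate the reconstruction with each candidate audio stream.
def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

decision = "A" if corr(recon, env_a) > corr(recon, env_b) else "B"
```

In this toy setting the reconstruction correlates strongly with the envelope that was mixed into the EEG, so the correct talker is selected; the end-to-end DNN compared in the paper replaces the explicit correlation step of Stage 2.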


Conflict of interest statement

All authors are part of a provisional patent application on the end-to-end, deep neural network auditory attention decoding algorithm described in this work. NM and J O’S are inventors on submitted patent WO2017218492A1 which covers neural decoding of auditory attention.

Figures

Figure 1
System architecture for auditory attention decoding: backward model. The temporal response function (TRF) can be linear or non-linear (neural network, see Fig. 3).
Figure 2
System architecture for auditory attention decoding: DNN binary classification. See Fig. 4 for a specific instance of the DNN.
Figure 3
The neural-network architecture for stimulus reconstruction, based on the design in de Taillez et al. A single hidden layer with two nodes (FC1) enforces significant compression of the EEG data before it is transformed into a predicted audio stimulus (see Fig. 1 for the system architecture). BN = batch normalization, FC = fully connected.
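A forward pass through a reconstruction network of this shape can be sketched in NumPy. This is a hedged illustration of the bottleneck idea only: the feature count, lag structure, tanh activation, and weight scales are assumptions, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_features = 500, 64 * 16   # e.g. 64 channels x 16 lags (assumed)

def batch_norm(x, eps=1e-5):
    """Normalize each feature to zero mean / unit variance over the batch."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

x = rng.standard_normal((n_steps, n_features))   # lagged EEG features
W1 = rng.standard_normal((n_features, 2)) * 0.01  # FC1: two-node bottleneck
b1 = np.zeros(2)
W2 = rng.standard_normal((2, 1)) * 0.01           # output: predicted envelope
b2 = np.zeros(1)

h = np.tanh(batch_norm(x) @ W1 + b1)   # compressed 2-node representation
envelope_hat = (h @ W2 + b2).ravel()   # one predicted audio sample per step
```

The two-node hidden layer means that, whatever the input dimensionality, each time step is squeezed through just two numbers before the audio prediction is made.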
Figure 4
The convolutional architecture used for integrated similarity computation between EEG and a candidate audio stream. Components include batch normalization (BN), convolution layers (Conv1, Conv2), exponential linear units (ELU), drop-outs (DO), and fully connected layers (FC1–FC4). Wet EEG (kernel × input channels × output channels): Conv1: 3 × 65 × 64, Conv2: 1 × 64 × 2. Dry EEG: Conv1: 3 × 19 × 19, Conv2: 1 × 19 × 2. Both: FC1: 246 × 200, FC2: 200 × 200, FC3: 200 × 100, FC4: 100 × 1; MaxPool 1D, stride 2. See Fig. 2 for the system architecture.
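The layer sizes in the Fig. 4 caption can be sanity-checked with simple shape arithmetic. The sketch below assumes "valid" convolutions and a max-pool of kernel 2, stride 2, and assumes an input window of 248 samples, chosen so that the flattened size matches the stated FC1 input of 246; both are assumptions, not values from the paper.

```python
def conv1d_len(n, kernel, stride=1):
    """Output length of a 'valid' 1-D convolution or pooling layer."""
    return (n - kernel) // stride + 1

T = 248                                 # assumed samples per decision window
t = conv1d_len(T, kernel=3)             # Conv1, kernel 3
t = conv1d_len(t, kernel=2, stride=2)   # MaxPool 1D, stride 2
t = conv1d_len(t, kernel=1)             # Conv2, kernel 1 (length unchanged)
flat = 2 * t                            # Conv2 emits 2 channels, then flatten
# flat should equal the stated FC1 input size of 246 under these assumptions
```

Under these assumed conventions the flattened feature count comes out to 2 × 123 = 246, consistent with the FC1: 246 × 200 layer in the caption.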
Figure 5
Per-subject attention-decoding accuracy using a wet, 64-channel EEG system with a 10-second evaluation window, for three algorithms: linear stimulus reconstruction (LSQ), non-linear stimulus reconstruction (DNN Corr.), and DNN classification (DNN Clf.). Chance performance is indicated by the black stars.
Figure 6
Per-subject attention-decoding accuracy using a dry EEG system with a 10-second evaluation window, for three algorithms: linear stimulus reconstruction (LSQ), non-linear stimulus reconstruction (DNN Corr.), and DNN classification (DNN Clf.). Chance performance is indicated by the black stars.
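A per-subject chance level like the black stars in Figs 5 and 6 is commonly derived from the binomial distribution: the smallest accuracy a fair coin would be unlikely to reach over the subject's decision windows. The sketch below is a generic illustration of that idea; the window count (100) and significance level are assumptions, not the paper's values.

```python
from math import comb

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct guesses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def chance_threshold(n, alpha=0.05):
    """Smallest accuracy a guesser reaches with probability below alpha."""
    for k in range(n + 1):
        if binom_sf(k, n) < alpha:
            return k / n
    return 1.0

threshold = chance_threshold(100)  # just under 0.6 for 100 windows
```

Shorter evaluation windows yield more decisions per subject, which pulls the chance threshold closer to 50% but also makes each individual decision noisier.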
Figure 7
Normalized grand average headmaps of the LSQ TRF values across subjects.
Figure 8
Normalized grand average headmaps of the mean convolutional weights for the wet, wet sub-sampled, and dry EEG systems for the DNN classifier network. The colors are scaled between 0 and 1. The audio channel weights (not shown) were 1.7, 1.1, and 0.77 for the wet, wet sub-sampled, and dry systems, respectively.
