Brain Res. 2016 Aug 1:1644:203-12.
doi: 10.1016/j.brainres.2016.05.029. Epub 2016 May 16.

Attention selectively modulates cortical entrainment in different regions of the speech spectrum

Lucas S Baltzell et al. Brain Res. 2016.

Abstract

Recent studies have uncovered a neural response that appears to track the envelope of speech, and have shown that this tracking process is mediated by attention. It has been argued that this tracking reflects a process of phase-locking to the fluctuations of stimulus energy, ensuring that this energy arrives during periods of high neuronal excitability. Because all acoustic stimuli are decomposed into spectral channels at the cochlea, and this spectral decomposition is maintained along the ascending auditory pathway and into auditory cortex, we hypothesized that the overall stimulus envelope is not as relevant to cortical processing as the individual frequency channels; attention may be mediating envelope tracking differentially across these spectral channels. To test this we reanalyzed data reported by Horton et al. (2013), where high-density EEG was recorded while adults attended to one of two competing naturalistic speech streams. In order to simulate cochlear filtering, the stimuli were passed through a gammatone filterbank, and temporal envelopes were extracted at each filter output. Following Horton et al. (2013), the attended and unattended envelopes were cross-correlated with the EEG, and local maxima were extracted at three different latency ranges corresponding to distinct peaks in the cross-correlation function (N1, P2, and N2). We found that the ratio between the attended and unattended cross-correlation functions varied across frequency channels in the N1 latency range, consistent with the hypothesis that attention differentially modulates envelope-tracking activity across spectral channels.
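The cross-correlation and peak-extraction steps described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the sampling rate, the synthetic signals, and the choice to take the largest absolute value inside each latency window are all assumptions made for the example.

```python
import numpy as np

def cross_correlate(envelope, eeg, fs, max_lag_ms):
    """Normalized cross-correlation between a stimulus envelope and one EEG channel.

    Positive lags mean the EEG follows the stimulus. Returns (lags in ms, values).
    Hypothetical helper; dividing by n gives the biased estimate.
    """
    max_lag = int(round(max_lag_ms * fs / 1000))
    lags = np.arange(-max_lag, max_lag + 1)
    env = (envelope - envelope.mean()) / envelope.std()
    sig = (eeg - eeg.mean()) / eeg.std()
    n = len(env)
    cc = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            cc[i] = np.dot(env[:n - lag], sig[lag:]) / n
        else:
            cc[i] = np.dot(env[-lag:], sig[:n + lag]) / n
    return lags * 1000.0 / fs, cc

def window_peak(lags_ms, cc, center_ms, half_width_ms=25.0):
    """Return (lag, value) of the largest-magnitude point within center +/- half width."""
    mask = np.abs(lags_ms - center_ms) <= half_width_ms
    idx = np.argmax(np.abs(cc[mask]))
    return lags_ms[mask][idx], cc[mask][idx]

# Synthetic stand-ins: a noise "envelope" and an "EEG" channel that is an
# inverted copy delayed by 90 ms (18 samples at 200 Hz) plus noise.
fs = 200
rng = np.random.default_rng(0)
envelope = rng.standard_normal(4000)
eeg = -np.roll(envelope, 18) + 0.5 * rng.standard_normal(4000)

lags_ms, cc = cross_correlate(envelope, eeg, fs, max_lag_ms=1000)
n1_lag, n1_val = window_peak(lags_ms, cc, center_ms=90)
```

In this synthetic case the window search should land on the 90 ms delay built into the fake EEG, with a negative correlation value. With real data the same search would be repeated per filterbank channel, per recording site, and for the attended and unattended envelopes.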

Keywords: Attention; EEG; Entrainment; Speech envelopes.

Figures

Figure 1
Cross-correlations between original speech envelopes and EEG activity at 128 recording channels. While we computed cross-correlations with delays from -1000 to +1000 ms, we show only -200 to +600 ms for viewing convenience, and because no significant peaks exist outside of this range. These cross-correlations generate temporal response functions that recover the N1-P2-N2 auditory evoked response, and while the response is clearest in the attended temporal response function, this pattern can also be observed in the unattended temporal response function.
Figure 2
The frequency response functions of the gammatone filters used in the experiment. On a linear frequency axis, bandwidths increase with increasing center frequency. The MATLAB code used to generate the gammatone filter coefficients was derived from Slaney (1993).
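To illustrate why bandwidth grows with center frequency, the sketch below evaluates the widely used Glasberg and Moore equivalent rectangular bandwidth (ERB) approximation at the band-edge center frequencies named in the later captions. Treating this formula as the bandwidth rule is an assumption; the filter parameters actually used in the study are not given here.

```python
import numpy as np

def erb_hz(center_hz):
    """Glasberg and Moore ERB approximation, in Hz.

    Assumed here as the bandwidth rule for illustration only.
    """
    center_hz = np.asarray(center_hz, dtype=float)
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)

# Center frequencies taken from the region boundaries quoted in the captions.
center_freqs = np.array([100.0, 338.0, 430.0, 1452.0, 1851.0, 6246.0])
bandwidths = erb_hz(center_freqs)

for cf, bw in zip(center_freqs, bandwidths):
    print(f"CF {cf:7.1f} Hz -> ERB {bw:6.1f} Hz")
```

Because the ERB is a linear function of center frequency with a positive slope, each successive filter is wider in Hz than the one before it, which matches the widening responses on a linear frequency axis.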
Figure 3
Scalp topographies for cross-correlation values at all recording sites averaged over latency ranges corresponding to N1, P2, and N2. For each range, a clear anterior-posterior dipole is observed.
Figure 4
[A] Cross-correlation maxima as a function of gammatone filter center frequency for latencies corresponding to the N1 (90 ± 25 ms), P2 (200 ± 25 ms) and N2 (350 ± 25 ms) peaks. The noise floor (gray) shows the range of correlation values that would occur by chance if the stimulus envelope were unrelated to the EEG. [B] The data-reduced version of [A], collapsed into low (100–338 Hz), mid (430–1452 Hz), and high (1851–6246 Hz) frequency regions.
Figure 5
[A] Log-ratios between attended and unattended cross-correlation maxima as a function of frequency region (Low: 100–338 Hz; Mid: 430–1452 Hz; High: 1851–6246 Hz) in the N1 latency range. The solid line indicates the grand average, and each dotted line represents an individual subject. To the right of this plot is a bar graph showing the outcome of paired-comparison post-hoc tests. Error bars represent standard errors of the mean. [B] Same as [A] but for the N2 latency range.
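The log-ratio measure in this caption can be sketched as below. The numbers are invented for illustration and are not the study's data; the natural logarithm and the treatment of the maxima as positive magnitudes are also assumptions, since the caption does not specify them.

```python
import numpy as np

def log_ratio(attended, unattended):
    """Elementwise natural log of attended / unattended cross-correlation maxima.

    Assumes both inputs are positive magnitudes. A value of 0 means no
    attention effect; positive values mean stronger attended tracking.
    """
    attended = np.asarray(attended, dtype=float)
    unattended = np.asarray(unattended, dtype=float)
    return np.log(attended / unattended)

# Made-up maxima: rows are subjects, columns are the low, mid, high regions.
attended = np.array([[0.040, 0.050, 0.030],
                     [0.030, 0.045, 0.030]])
unattended = np.array([[0.020, 0.050, 0.030],
                       [0.030, 0.030, 0.020]])

ratios = log_ratio(attended, unattended)
grand_average = ratios.mean(axis=0)                           # solid line
sem = ratios.std(axis=0, ddof=1) / np.sqrt(ratios.shape[0])   # error bars
```

Working in log units makes the measure symmetric: doubling and halving the attended response give values of equal size and opposite sign.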
Figure 6
[A] Total power in each gammatone filter. [B] Correlation (normalized covariance) matrix for the same data. [C] Correlations between the envelope at the output of each gammatone filter with the original (full-band) stimulus envelope. This stimulus-to-stimulus correlation function can be thought of as the shape of the expected stimulus envelope-to-EEG cross-correlation by frequency function if the full-band stimulus envelope were being entrained.
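The quantities in panels [B] and [C] can be illustrated with synthetic data. Everything in this sketch is hypothetical: the band envelopes are random signals sharing a common modulation, and the full-band envelope is crudely approximated by their sum, which is not how the study derived it.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands, n_samples = 4, 5000

# Hypothetical band envelopes: a shared modulation weighted differently per
# band, plus independent band-specific fluctuations. Signals are not forced
# to be non-negative because correlation is unaffected by offset or scale.
shared = rng.standard_normal(n_samples)
weights = np.array([0.9, 0.7, 0.4, 0.1])
band_envs = weights[:, None] * shared + 0.5 * rng.standard_normal((n_bands, n_samples))

# Crude stand-in for the full-band envelope (assumption for this example).
full_band = band_envs.sum(axis=0)

# [B] Correlation (normalized covariance) matrix across band envelopes.
band_corr = np.corrcoef(band_envs)

# [C] Correlation of each band envelope with the full-band envelope.
band_to_full = np.array([np.corrcoef(env, full_band)[0, 1] for env in band_envs])
```

If cortical activity tracked only the full-band envelope, the envelope-to-EEG correlation across channels would be expected to follow the shape of `band_to_full`; departures from that shape are what make a channel-specific account worth testing.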
Figure 7
Cross-correlation functions between original speech envelopes and EEG activity at 128 recording channels for four representative frequency channels (CF) that span the range of CFs included in our analysis. Notice that both the attended and unattended cross-correlation functions show significant structure in the ∼65-365 ms latency range, while the control cross-correlation functions do not.
