Cortical responses time-locked to continuous speech in the high-gamma band depend on selective attention

Front Neurosci. 2023 Dec 14;17:1264453. doi: 10.3389/fnins.2023.1264453. eCollection 2023.


Vrishab Commuri¹, Joshua P Kulasingham², Jonathan Z Simon¹,³,⁴

Affiliations

¹ Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States.
² Department of Electrical Engineering, Linköping University, Linköping, Sweden.
³ Department of Biology, University of Maryland, College Park, MD, United States.
⁴ Institute for Systems Research, University of Maryland, College Park, MD, United States.

PMID: 38156264
PMCID: PMC10752935
DOI: 10.3389/fnins.2023.1264453


Vrishab Commuri et al. Front Neurosci. 2023.


Abstract

Auditory cortical responses to speech obtained by magnetoencephalography (MEG) show robust speech tracking to the speaker's fundamental frequency in the high-gamma band (70–200 Hz), but little is currently known about whether such responses depend on the focus of selective attention. In this study, 22 human subjects listened to concurrent, fixed-rate speech from male and female speakers and were asked to selectively attend to one speaker at a time while their neural responses were recorded with MEG. The male speaker's pitch range coincided with the lower range of the high-gamma band, whereas the female speaker's higher pitch range had much less overlap, and only at the upper end of the high-gamma band. Neural responses were analyzed using the temporal response function (TRF) framework. As expected, the responses demonstrate robust speech tracking of the fundamental frequency in the high-gamma band, but only to the male speaker's speech, with a peak latency of ~40 ms. Critically, the response magnitude depends on selective attention: the response to the male speech is significantly greater when the male speech is attended than when it is not attended, under acoustically identical conditions. This is a clear demonstration that even very early cortical auditory responses are influenced by top-down, cognitive, neural processing mechanisms.

Keywords: cocktail party; cortical FFR; phase-locked response; primary auditory cortex; speech tracking.
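
The temporal response function (TRF) framework named in the abstract models the neural response as a stimulus predictor convolved with a kernel over time lags. The following is a minimal sketch of that forward model only, under assumed toy values: the sampling rate, the kernel shape, and the function name `predict_response` are hypothetical and are not taken from the study. How a TRF is estimated from recorded data is not covered here.

```python
def predict_response(stimulus, trf):
    """Predict a response as the causal convolution of a stimulus with a TRF kernel."""
    predicted = []
    for t in range(len(stimulus)):
        total = 0.0
        for lag, weight in enumerate(trf):
            if t - lag >= 0:  # only past and present stimulus samples contribute
                total += weight * stimulus[t - lag]
        predicted.append(total)
    return predicted


fs = 1000  # assumed sampling rate in Hz, so one sample equals 1 ms

# Toy kernel (hypothetical): a narrow bump peaking at a 40 ms lag.
trf = [0.0] * 100
trf[39], trf[40], trf[41] = 0.5, 1.0, 0.5

# Toy stimulus: a single impulse at 10 ms.
stimulus = [0.0] * 200
stimulus[10] = 1.0

response = predict_response(stimulus, trf)
peak_index = max(range(len(response)), key=lambda i: abs(response[i]))
peak_ms = peak_index * 1000 // fs
print(peak_ms)  # 50: the 10 ms impulse plus the 40 ms kernel latency
```

In this toy case the predicted response to an impulse is simply a copy of the kernel shifted to the impulse time, which is why the latency of a TRF peak can be read as the delay of the time-locked response.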


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Illustration of how the carrier and envelope modulations predictors are extracted from an auditory stimulus. The raw stimulus waveform is shown in the bottom-left corner. **Envelope modulations predictor:** to generate the envelope modulations predictor, starting with the raw waveform and following the arrows up and to the right, first an auditory spectrogram is generated using a model of the auditory periphery (Yang et al., 1992). Then, the acoustic envelope in each frequency bin in the range 300–4,000 Hz is bandpassed in the high-gamma range (70–200 Hz), and the average is then computed across the channels. The result is a single time-series signal. **Carrier predictor:** to generate the carrier predictor, following the arrows to the right, the raw stimulus waveform is simply bandpass filtered to the high-gamma range. The result is a second single time-series signal. [Figure reproduced with permission from Kulasingham et al. (2020)].
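
The carrier predictor described in the Figure 1 caption is the stimulus waveform band-pass filtered to the high-gamma range (70–200 Hz). The sketch below illustrates that single step with a second-order (biquad) band-pass filter in plain Python. The filter design, the sampling rate, and the test tones are assumptions made for illustration; the caption does not specify which filter the authors used, and the passband edges of this simple filter are only approximate.

```python
import math


def bandpass_biquad(signal, fs, low_hz=70.0, high_hz=200.0):
    """Apply a second-order band-pass filter with unity gain at the band center."""
    f0 = math.sqrt(low_hz * high_hz)   # geometric center of the band
    q = f0 / (high_hz - low_hz)        # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0   # the b1 coefficient is zero for this design
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


fs = 1000.0  # assumed sampling rate in Hz
times = [n / fs for n in range(2000)]
in_band_tone = [math.sin(2 * math.pi * 100.0 * t) for t in times]      # inside 70-200 Hz
out_of_band_tone = [math.sin(2 * math.pi * 450.0 * t) for t in times]  # outside the band

in_band_rms = rms(bandpass_biquad(in_band_tone, fs))
out_band_rms = rms(bandpass_biquad(out_of_band_tone, fs))
print(in_band_rms > out_band_rms)
```

A tone inside the band passes nearly unchanged while a tone well above it is strongly attenuated, so the filtered waveform retains mainly the stimulus components that fall in the high-gamma range.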

**Figure 2**
Prediction accuracies for male single-speaker **(Top)** and cocktail-party **(Bottom)** models. Red regions denote voxels where the TRF model produced a prediction accuracy that was significantly greater than that of the noise within the ROI. TRFs to female speech (not shown) did not produce significant responses in any voxels.

**Figure 3**
Comparison of male speech and female speech TRFs for the single speaker conditions. Solid black lines indicate the TRF grand average (over TRF amplitude, averaged across voxels in the ROI); shaded regions indicate values within one standard error of the mean. Red shading indicates TRF values significantly above the noise floor. The distribution of TRF vectors in the brain at the time with the maximum significant response is plotted as an inset for each TRF. **(Top left)** Average TRF of the envelope modulations predictor derived from the male speaker stimulus. Note the large significant response at ~30–50 ms in the TRF which indicates a consistent, time-locked neural response to the speech envelope modulations at a 30–50 ms latency. **(Top right)** Average TRF of the envelope modulations predictor derived from the female speaker stimulus. Notice the lack of a significant response in the average TRF or a region of significance over the null model. Similar results were observed for the carrier stimuli: **(Bottom left)** Average TRF of the carrier predictor derived from the male speaker stimulus. Note the significant response in the TRF at the same latency observed for the corresponding envelope TRF. **(Bottom right)** Average TRF of the carrier predictor derived from the female speaker stimulus. As in the case of the corresponding envelope TRF, there is no significant response observed for this TRF.

**Figure 4**
Comparison of attended and unattended TRFs for the male speech stimuli, in the cocktail-party setting. Solid black lines indicate the TRF grand average (over TRF amplitude, averaged across voxels in the ROI); shaded regions indicate values within one standard error of the mean. Red shading indicates TRF values significantly above the noise floor. The distribution of TRF vectors in the brain at the time with the maximum significant response is plotted as an inset for each TRF. **(Top left)** Male speech envelope TRF for subjects attending to the male speech (female speech is background). A large significant response in the TRF is observed between ~30–50 ms which indicates a consistent, time-locked neural response to the speech envelope modulations at a 30–50 ms latency. **(Top right)** Male speech envelope TRF for subjects attending to the female speech (male speech is background). **(Bottom left)** Male speech carrier TRF for subjects attending to the male speech (female speech is background). **(Bottom right)** Male speech carrier TRF for subjects attending to the female speech (male speech is background). Linear mixed effects model and *post-hoc* test results indicate that the attended speech TRF peak amplitude is significantly greater than the unattended speech TRF peak amplitude.

**Figure 5**
Cocktail-party male speech TRF peak amplitude comparison across subjects. Male speech TRF peak amplitudes in the latency range 20–50 ms are presented for attend male (red) and ignore male (gray) conditions. Dashed lines show each individual subject's change in peak height between attend and ignore conditions. Solid lines show the change in the mean between the conditions. For the envelope TRFs, note the significant decrease in the mean value, and for most subjects, between the conditions. No such trend is observed in the carrier TRFs. ***p < 0.001.


Update of

Cortical Responses Time-Locked to Continuous Speech in the High-Gamma Band Depend on Selective Attention.
Commuri V, Kulasingham JP, Simon JZ. bioRxiv [Preprint]. 2023 Oct 15:2023.07.20.549567. doi: 10.1101/2023.07.20.549567. Update in: Front Neurosci. 2023 Dec 14;17:1264453. doi: 10.3389/fnins.2023.1264453. PMID: 37546895.

Cited by

No Evidence of Musical Training Influencing the Cortical Contribution to the Speech-Frequency-Following Response and Its Modulation through Selective Attention.
Riegel J, Schüller A, Reichenbach T. eNeuro. 2024 Sep 5;11(9):ENEURO.0127-24.2024. doi: 10.1523/ENEURO.0127-24.2024. Print 2024 Sep. PMID: 39160069.

Fundamental frequency predominantly drives talker differences in auditory brainstem responses to continuous speech.
Polonenko MJ, Maddox RK. JASA Express Lett. 2024 Nov 1;4(11):114401. doi: 10.1121/10.0034329. PMID: 39504231.

Fundamental frequency predominantly drives talker differences in auditory brainstem responses to continuous speech.
Polonenko MJ, Maddox RK. bioRxiv [Preprint]. 2024 Jul 13:2024.07.12.603125. doi: 10.1101/2024.07.12.603125. Update in: JASA Express Lett. 2024 Nov 1;4(11):114401. doi: 10.1121/10.0034329. PMID: 39026858.

Anatomically distinct cortical tracking of music and speech by slow (1–8 Hz) and fast (70–120 Hz) oscillatory activity.
Osorio S, Assaneo MF. PLoS One. 2025 May 8;20(5):e0320519. doi: 10.1371/journal.pone.0320519. eCollection 2025. PMID: 40341725.

Improving Tracking of Selective Attention in Hearing Aid Users: The Role of Noise Reduction and Nonlinearity Compensation.
Wilroth J, Alickovic E, Skoglund MA, Signoret C, Rönnberg J, Enqvist M. eNeuro. 2025 Feb 19;12(2):ENEURO.0275-24.2025. doi: 10.1523/ENEURO.0275-24.2025. Print 2025 Feb. PMID: 39880674.


References

1. Ahissar E., Nagarajan S., Ahissar M., Protopapas A., Mahncke H., Merzenich M. M. (2001). Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc. Natl. Acad. Sci. U.S.A. 98, 13367–13372. doi: 10.1073/pnas.201400998
2. Basu M., Krishnan A., Weber-Fox C. (2010). Brainstem correlates of temporal auditory processing in children with specific language impairment. Dev. Sci. 13, 77–91. doi: 10.1111/j.1467-7687.2009.00849.x
3. Bates D., Mächler M., Bolker B., Walker S. (2015). Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48. doi: 10.18637/jss.v067.i01
4. Bidelman G. M. (2018). Subcortical sources dominate the neuroelectric auditory frequency-following response to speech. NeuroImage 175, 56–69. doi: 10.1016/j.neuroimage.2018.03.060
5. Bidet-Caulet A., Fischer C., Besle J., Aguera P.-E., Giard M.-H., Bertrand O. (2007). Effects of selective attention on the electrophysiological representation of concurrent sounds in the human auditory cortex. J. Neurosci. 27, 9252–9261. doi: 10.1523/JNEUROSCI.1402-07.2007


