Neural Measures of Pitch Processing in EEG Responses to Running Speech

Front Neurosci. 2021 Dec 21;15:738408. doi: 10.3389/fnins.2021.738408. eCollection 2021.

Florine L Bachmann¹, Ewen N MacDonald², Jens Hjortkjær¹,³

Affiliations

¹ Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark.
² Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada.
³ Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital - Amager and Hvidovre, Copenhagen, Denmark.

PMID: 35002597
PMCID: PMC8729880
DOI: 10.3389/fnins.2021.738408

Florine L Bachmann et al. Front Neurosci. 2021.


Abstract

Linearized encoding models are increasingly employed to model cortical responses to running speech. Recent extensions to subcortical responses suggest clinical perspectives, potentially complementing auditory brainstem responses (ABRs) or frequency-following responses (FFRs) that are current clinical standards. However, while it is well known that the auditory brainstem responds both to transient amplitude variations and to the stimulus periodicity that gives rise to pitch, these features co-vary in running speech. Here, we discuss challenges in disentangling the features that drive the subcortical response to running speech. Cortical and subcortical electroencephalographic (EEG) responses to running speech from 19 normal-hearing listeners (12 female) were analyzed. Using forward regression models, we confirm that responses to the rectified broadband speech signal yield temporal response functions consistent with wave V of the ABR, as shown in previous work. Peak latency and amplitude of the speech-evoked brainstem response were correlated with standard click-evoked ABRs recorded at the vertex electrode (Cz). Similar responses could be obtained using the fundamental frequency (F0) of the speech signal as the model predictor. However, simulations indicated that dissociating responses to temporal fine structure at the F0 from broadband amplitude variations is not possible given the high co-variance of the features and the poor signal-to-noise ratio (SNR) of subcortical EEG responses. In cortex, both simulations and data replicated previous findings indicating that envelope tracking on frontal electrodes can be dissociated from responses to slow variations in F0 (relative pitch). Yet, no association between subcortical F0-tracking and cortical responses to relative pitch could be detected. These results indicate that while subcortical speech responses are comparable to click-evoked ABRs, dissociating pitch-related processing in the auditory brainstem may be challenging with natural speech stimuli.

Keywords: EEG; auditory brainstem response; encoding model; neural tracking; running speech; subcortical; temporal response function.
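The forward regression approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the sampling rate, the toy rectified-noise predictor, the single-peak kernel at a 7 ms lag, the noise level, and the helper names `lagged_matrix` and `fit_trf` are all assumptions chosen for illustration. The sketch regresses a simulated noisy response onto time-lagged copies of the predictor and reads the peak latency off the estimated temporal response function (TRF).

```python
import numpy as np

def lagged_matrix(x, n_lags):
    """Stack delayed copies of x as columns: column k holds x delayed by k samples."""
    n = len(x)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = x[:n - k]
    return X

def fit_trf(x, y, n_lags, lam=0.0):
    """Ridge solution w = (X'X + lam*I)^-1 X'y; lam=0 gives ordinary least squares."""
    X = lagged_matrix(x, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ y)

rng = np.random.default_rng(0)
fs = 1000                                           # hypothetical sampling rate in Hz
x = np.maximum(rng.standard_normal(20 * fs), 0.0)   # toy half-wave rectified predictor
true_trf = np.zeros(15)
true_trf[7] = 1.0                                   # toy response peak at a 7 ms lag
# Simulated "EEG": predictor convolved with the kernel, buried in noise
y = lagged_matrix(x, 15) @ true_trf + 5.0 * rng.standard_normal(len(x))

trf = fit_trf(x, y, n_lags=15, lam=1.0)
peak_latency_ms = 1000 * np.argmax(trf) / fs        # latency of the largest TRF weight
print(f"Estimated peak latency: {peak_latency_ms:.1f} ms")
```

In this toy setting the response is strong relative to the noise, so the peak is recovered from 20 s of data; real subcortical EEG has a far poorer SNR and needs much longer recordings.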

Conflict of interest statement

The authors declare that this study received funding from Sonova, a major hearing care solutions company (FB). The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication. The authors declare that the research was conducted in the absence of other commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Processing pipeline of the audio and EEG in the subcortical speech response analysis. Processing steps that are identical for the broadband and the F0 band-passed features are shown in gray. Processing steps only applied for the broadband or the F0 band-passed approach are indicated with light blue (dotted line), and darker blue (dashed line), respectively. Apart from the different stimulus-response models, the two approaches mainly differ in that a half-wave rectification is applied for the broadband approach, and a 240 Hz low-pass filter is applied for the F0 band-passed approach.

**Figure 2**
Speech stimulus **(A)** and audio features **(B–E)** used as model predictors in the different analyses. Half-wave rectified, high-passed (80 Hz) broadband **(B)** and band-passed (80 to 240 Hz, around F0) **(C)** features were used for modeling subcortical activity. Low-frequency changes in speech amplitude **(D)** and the relative F0 trajectory **(E)** were used for modeling cortical activity. Amplitude and F0 values were z-standardized prior to plotting.
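The two subcortical predictors described in this caption can be sketched as follows. The cutoff frequencies (80 Hz and 240 Hz) come from the captions; everything else is an assumption for illustration: the brick-wall FFT filter stands in for whatever filters the authors used, the filter-then-rectify order is assumed, and the toy harmonic signal with a 120 Hz F0 replaces real speech.

```python
import numpy as np

def fft_bandpass(x, fs, f_lo, f_hi=None):
    """Zero-phase brick-wall filter: zero all FFT bins outside [f_lo, f_hi]."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    keep = freqs >= f_lo
    if f_hi is not None:
        keep &= freqs <= f_hi
    return np.fft.irfft(spectrum * keep, n=len(x))

def zscore(x):
    """Standardize to zero mean and unit variance."""
    return (x - x.mean()) / x.std()

fs = 4000                                   # hypothetical sampling rate in Hz
t = np.arange(2 * fs) / fs
# Toy "voiced speech": five harmonics of a 120 Hz F0 with a slow 4 Hz amplitude modulation
audio = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * sum(
    np.sin(2 * np.pi * 120 * h * t) / h for h in range(1, 6))

broadband = np.maximum(fft_bandpass(audio, fs, 80), 0.0)   # high-pass at 80 Hz, then half-wave rectify
f0_band = fft_bandpass(audio, fs, 80, 240)                 # band-pass 80-240 Hz around F0
r = np.corrcoef(zscore(broadband), zscore(f0_band))[0, 1]  # co-variation of the two predictors
print(f"Correlation between broadband and F0-band features: {r:.2f}")
```

Even in this toy signal the two predictors are substantially correlated, which is the co-variation problem the abstract raises for running speech.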

**Figure 3**
Modeled subcortical responses to different speech features. All traces show the response at electrode Cz. Shaded areas indicate ±1 standard error of the mean across participants. **(A)** Modeled responses for different speech features and topographies at mean peak latencies. Top: Complex cross-correlation between the F0 band-passed speech and the EEG (80–240 Hz). Middle: The EEG (>80 Hz) regressed onto the broadband (>80 Hz) rectified speech signal using ordinary least-squares regression. Bottom: Conventional click-evoked ABRs. **(B)** Subcortical response functions modeled using the F0 band-passed (top) or broadband features (middle) and estimated using cross-correlation or regression. For regression, both the unregularized ordinary least-squares solution and a highly regularized solution (λ = 10⁹) are shown.
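The relation between the estimators compared in panel (B) can be sketched numerically. For ridge regression, w = (XᵀX + λI)⁻¹Xᵀy, so when λ is much larger than the eigenvalues of XᵀX the solution approaches Xᵀy/λ, which is a scaled cross-correlation. The example below uses a toy autocorrelated predictor (an exponentially smoothed noise sequence) and a single-peak kernel; these, along with the sample count and noise level, are assumptions for illustration and not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_lags = 5000, 20
# Autocorrelated toy predictor: white noise smoothed with an exponential kernel
x = np.convolve(rng.standard_normal(n), 0.8 ** np.arange(50))[:n]
# Lagged design matrix: column k holds x delayed by k samples
X = np.column_stack([np.concatenate([np.zeros(k), x[:n - k]]) for k in range(n_lags)])
kernel = np.zeros(n_lags)
kernel[8] = 1.0                                   # toy response with a single peak at lag 8
y = X @ kernel + rng.standard_normal(n)

def ridge(lam):
    """Ridge regression weights for regularization strength lam."""
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ y)

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

xcorr = X.T @ y                                   # cross-correlation at each lag
ols, heavy = ridge(0.0), ridge(1e9)
similarity = unit(heavy) @ unit(xcorr)            # cosine similarity of the two shapes
print(f"cosine similarity (heavy ridge vs cross-correlation): {similarity:.6f}")
print("OLS weights around the peak:      ", np.round(ols[6:11], 2))
print("heavy ridge, normalized to peak:  ", np.round(heavy[6:11] / heavy[8], 2))
```

In this sketch the least-squares estimate stays concentrated at the true lag, while the heavily regularized estimate takes the shape of the cross-correlation and inherits the predictor's autocorrelation, spreading the peak over neighboring lags.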

**Figure 4**
Simulated prediction accuracies for the different envelope and pitch-related features as a function of SNR. Top: comparison of the F0 band-passed and broadband rectified speech features considered in the subcortical response analysis. Bottom: comparison of the envelope and relative pitch features considered in the cortical analysis. Purple: Simulated regression prediction accuracies for the best performing individual feature model. Red: Improvement in prediction accuracy by combining the two features relative to the best performing individual model. Yellow: Improvement in prediction accuracy by combining the relevant features with an uncorrelated predictor as an estimate of the upper limit of prediction improvement.
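The logic of such a simulation can be sketched in a few lines. This is a simplified stand-in, not the authors' simulation: the two features are plain Gaussian sequences with a hypothetical correlation of 0.9, the response is their sum plus noise, no time lags are modeled, and the helper name `heldout_r` is invented here. The sketch compares the held-out prediction accuracy of the best single-feature model with that of the combined model at several SNRs.

```python
import numpy as np

def heldout_r(F, y, n_train):
    """Fit least squares on the first n_train samples; return Pearson's r on the rest."""
    w, *_ = np.linalg.lstsq(F[:n_train], y[:n_train], rcond=None)
    return np.corrcoef(F[n_train:] @ w, y[n_train:])[0, 1]

rng = np.random.default_rng(2)
n, n_train, rho = 40000, 30000, 0.9      # rho: hypothetical correlation between features
f1 = rng.standard_normal(n)
f2 = rho * f1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
signal = f1 + f2                          # toy response driven by both features

acc = {}
for snr_db in (0, -20, -40):
    # Scale the noise so that signal and noise amplitudes differ by snr_db decibels
    noise = rng.standard_normal(n) * signal.std() / 10 ** (snr_db / 20)
    y = signal + noise
    best_single = max(heldout_r(f[:, None], y, n_train) for f in (f1, f2))
    combined = heldout_r(np.column_stack([f1, f2]), y, n_train)
    acc[snr_db] = (best_single, combined)
    print(f"SNR {snr_db:4d} dB: best single r={best_single:.3f}, combined r={combined:.3f}")
```

With highly correlated features, the expected gain from combining them is small even at 0 dB and shrinks further as the SNR drops, so at low SNR it becomes hard to tell apart from estimation noise.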

**Figure 5**
Simulated TRF responses for features with lower (left) or higher (right) degrees of autocorrelation. Bottom panels show the autocorrelation matrices of two simulated features (filtered random Gaussian variables). Top panels show the true (dashed lines) and estimated TRFs for different degrees of regularization (normalized amplitudes). For the more autocorrelated feature, higher regularization is required to estimate the true TRF, but overregularization leads to temporal smearing of the response function.
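One way to see why a more autocorrelated feature calls for more regularization is to look at the conditioning of its autocorrelation matrix, which is the matrix that regression has to invert. The sketch below assumes a first-order autoregressive (AR(1)) feature, whose autocorrelation at lag k is a^|k|; the caption only states that the simulated features were filtered Gaussian variables, so the AR(1) form, the coefficients 0.2 and 0.95, and the 30-lag window are illustrative assumptions.

```python
import numpy as np

def ar1_autocorr_matrix(a, n_lags):
    """Autocorrelation matrix of an AR(1) process: entry (i, j) equals a**|i - j|."""
    lags = np.arange(n_lags)
    return a ** np.abs(lags[:, None] - lags[None, :])

n_lags = 30
conds = {}
for a in (0.2, 0.95):                    # weakly vs. strongly autocorrelated feature
    R = ar1_autocorr_matrix(a, n_lags)
    plain = np.linalg.cond(R)                           # conditioning without regularization
    regularized = np.linalg.cond(R + np.eye(n_lags))    # ridge term with lambda = 1
    conds[a] = (plain, regularized)
    print(f"a={a}: cond(R)={plain:.1f}, cond(R + I)={regularized:.1f}")
```

A larger condition number means that noise is amplified more strongly along the weakly represented directions of the feature; adding λI lifts the small eigenvalues and lowers the condition number, at the cost of biasing the estimate toward the smeared cross-correlation shape.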

**Figure 6**
Prediction accuracy (Pearson's r) of cortical EEG for regression models containing relative pitch, envelope, or both features as predictors. Gray lines indicate individual subject data. The combined model yields significantly higher accuracy compared to either of the individual models, suggesting that both predictors explain a unique part of the variance. The envelope model shows significantly higher prediction accuracy compared to the relative pitch model.

Cited by

Auditory Encoding of Natural Speech at Subcortical and Cortical Levels Is Not Indicative of Cognitive Decline.
Bolt E, Giroud N. eNeuro. 2024 May 9;11(5):ENEURO.0545-23.2024. doi: 10.1523/ENEURO.0545-23.2024. Print 2024 May. PMID: 38658138.
Comparing methods for deriving the auditory brainstem response to continuous speech in human listeners.
Shan T, Maddox RK. Imaging Neurosci (Camb). 2025 Jun 3;3:IMAG.a.19. doi: 10.1162/IMAG.a.19. eCollection 2025. PMID: 40800859.
Predictors for estimating subcortical EEG responses to continuous speech.
Kulasingham JP, Bachmann FL, Eskelund K, Enqvist M, Innes-Brown H, Alickovic E. PLoS One. 2024 Feb 8;19(2):e0297826. doi: 10.1371/journal.pone.0297826. eCollection 2024. PMID: 38330068.
Neural speech tracking in a virtual acoustic environment: audio-visual benefit for unscripted continuous speech.
Daeglau M, Otten J, Grimm G, Mirkovic B, Hohmann V, Debener S. Front Hum Neurosci. 2025 Apr 9;19:1560558. doi: 10.3389/fnhum.2025.1560558. eCollection 2025. PMID: 40270565.
Extending Subcortical EEG Responses to Continuous Speech to the Sound-Field.
Bachmann FL, Kulasingham JP, Eskelund K, Enqvist M, Alickovic E, Innes-Brown H. Trends Hear. 2024 Jan-Dec;28:23312165241246596. doi: 10.1177/23312165241246596. PMID: 38738341.
