Review. 2014 Oct 30;8:348. doi: 10.3389/fnins.2014.00348. eCollection 2014.

Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech


Enrique A Lopez-Poveda. Front Neurosci.

Abstract

Hearing impairment is a serious disease with increasing prevalence. It is defined based on increased audiometric thresholds, but increased thresholds are only partly responsible for the greater difficulty understanding speech in noisy environments experienced by some older listeners or by hearing-impaired listeners. Identifying the additional factors and mechanisms that impair intelligibility is fundamental to understanding hearing impairment, but these factors remain uncertain. Traditionally, these additional factors have been sought in the way the speech spectrum is encoded in the pattern of impaired mechanical cochlear responses. Recent studies, however, are steering the focus toward impaired encoding of the speech waveform in the auditory nerve. In our recent work, we provided evidence that a significant factor might be the loss of afferent auditory nerve fibers, a pathology that comes with aging or noise overexposure. Our approach was based on a signal-processing analogy whereby the auditory nerve may be regarded as a stochastic sampler of the sound waveform and deafferentation may be described in terms of waveform undersampling. We showed that stochastic undersampling simultaneously degrades the encoding of soft and rapid waveform features, and that this degrades speech intelligibility more in noise than in quiet, without significant increases in audiometric thresholds. Here, we review our recent work in a broader context and argue that the stochastic undersampling analogy may be extended to study the perceptual consequences of various hearing pathologies and their treatment.

Keywords: aging; auditory deafferentation; auditory encoding; hearing impairment; hearing loss; speech intelligibility; speech processing; stochastic sampling.


Figures

Figure 1
A schematic illustration of the effects of stochastic undersampling on speech intelligibility in noise and in quiet. Consider a speech intelligibility task (e.g., the identification of sentences) in different amounts of background noise. The blue trace depicts a hypothetical psychometric function showing performance (the percentage of correctly identified sentences) as a function of the amount of noise, with the latter expressed as the speech-to-noise ratio (SNR) in dB. The speech reception threshold (SRT) is, by definition, the SNR at which the listener correctly identifies 50% of the sentences. Consider now that stochastic undersampling reduces the effective SNR by a fixed amount, depicted by the red arrow. For a speech-in-quiet condition, such an SNR reduction barely degrades performance. By contrast, for a more challenging condition of speech in noise, the same SNR reduction degrades performance significantly.
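The argument in this caption can be sketched numerically. The logistic shape, the SRT, the slope, and the 3 dB effective-SNR loss below are illustrative assumptions, not values from the paper:

```python
import math

def psychometric(snr_db, srt_db=-6.0, slope=0.5):
    """Hypothetical logistic psychometric function: proportion of
    sentences identified correctly as a function of SNR in dB.
    At snr_db == srt_db the function returns 0.5 by construction."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - srt_db)))

effective_snr_loss = 3.0  # fixed SNR reduction from undersampling (the red arrow)

# Speech in quiet (very high SNR): the loss barely moves performance
quiet_drop = psychometric(30.0) - psychometric(30.0 - effective_snr_loss)

# Speech in noise, near the SRT: the same loss costs far more
noise_drop = psychometric(-5.0) - psychometric(-5.0 - effective_snr_loss)

print(f"performance drop in quiet: {quiet_drop:.4f}")
print(f"performance drop in noise: {noise_drop:.4f}")
```

Because the psychometric function is flat at high SNRs and steep near the SRT, the same fixed SNR reduction produces a negligible drop in quiet but a large one in noise.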
Figure 2
An example simulation of stochastic undersampling by deafferentation and its consequences for the waveform representation in quiet. Consider a sound waveform (blue traces in A,C,D) and its full-wave rectified (FWR) version (green trace in A). Consider also four auditory nerve fibers, each of which can fire along the sound waveform following a simple principle: the probability of firing is proportional to the instantaneous sound pressure in the FWR waveform. Since spikes are stochastic events, spike trains are different for the four fibers (B). The green traces in (C,D) illustrate neural representations of the sound waveform that result from time-wise summation of only the upper two (C) or all four (D) spike trains, respectively. Clearly, the sound waveform is better represented in (D) than in (C). To illustrate this more clearly, acoustical-waveform equivalents of the aggregated spike trains are shown as red traces in (C,D). These were obtained by time-wise multiplication of the original waveform with an aggregated spike train obtained using a time-wise logical OR function (black spike trains in C,D). Clearly, the waveform reconstructed using four fibers more closely resembles the original waveform than that reconstructed using only two fibers (compare the red and blue traces in C,D). In other words, a reduction in the number of fibers degrades the neural representation of the sound waveform. For further details, see Lopez-Poveda and Barrios (2013).
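The sampling-and-reconstruction scheme in this caption can be sketched in a few lines. The test waveform, the firing-probability scale factor, and the fiber counts below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# A short tone burst as the sound waveform, and its full-wave
# rectified (FWR) version
t = np.linspace(0.0, 0.02, 400)
waveform = np.sin(2 * np.pi * 500 * t)
fwr = np.abs(waveform)

def spike_trains(fwr, n_fibers, rng):
    """Each fiber fires stochastically at each sample with probability
    proportional to the instantaneous FWR pressure (the 0.3 scale
    factor is an arbitrary choice for this sketch)."""
    p = 0.3 * fwr / fwr.max()
    return rng.random((n_fibers, fwr.size)) < p

def reconstruct(waveform, trains):
    """Acoustical-waveform equivalent of the aggregated spike trains:
    time-wise logical OR across fibers, then time-wise multiplication
    with the original waveform (red traces in C,D)."""
    aggregated = trains.any(axis=0)
    return waveform * aggregated

def mse(recon):
    return float(np.mean((waveform - recon) ** 2))

two_fibers = reconstruct(waveform, spike_trains(fwr, 2, rng))
four_fibers = reconstruct(waveform, spike_trains(fwr, 4, rng))

# More fibers -> denser stochastic sampling -> closer to the original
print(f"MSE with 2 fibers: {mse(two_fibers):.4f}")
print(f"MSE with 4 fibers: {mse(four_fibers):.4f}")
```

The reconstruction error shrinks as fibers are added, mirroring the caption's point that deafferentation (fewer fibers) degrades the neural representation of the waveform.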
Figure 3
A visual example to illustrate the consequences of stochastic undersampling of a signal in quiet and in noise. We used the stochastic sampling principles illustrated in Figure 2 (Lopez-Poveda and Barrios, 2013), whereby the probability of firing is proportional to intensity, or pixel darkness in this example. (A,B) The signal in quiet and in noise, respectively. The signal deliberately contains darker and lighter features that would correspond to intense and soft features in speech, respectively. It also contains thick and thin features that would correspond to low- and high-frequency features in speech, respectively. (C,D) Stochastically sampled images using 10 samplers per pixel. This number of samplers is sufficient to make the signal intelligible both in quiet (C) and in noise (D). (E,F) Stochastically sampled images using one stochastic sampler per pixel. Now the signal is still detectable and intelligible in quiet (E) but less so in noise (F). Particularly degraded are the low-intensity (lighter gray) and high-frequency (thinner lines) features of the signal, such as the “lo” portion of the upper “hello” word.
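A minimal numerical analogue of this pixel-sampling demonstration, assuming a toy 8×8 "image" in place of the handwritten words (the pixel values and sampler counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy image of darkness values in [0, 1]: a dark "intense" stroke and
# a lighter "soft" stroke on a blank background
image = np.zeros((8, 8))
image[3:5, 1:7] = 0.9   # dark feature (intense speech component)
image[6, 1:7] = 0.3     # light feature (soft speech component)

def sample(image, samplers_per_pixel, rng):
    """Each sampler 'fires' at a pixel with probability equal to the
    pixel's darkness; the sampled image is the fraction of samplers
    that fired at each pixel."""
    fired = rng.random((samplers_per_pixel,) + image.shape) < image
    return fired.mean(axis=0)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

dense = sample(image, 10, rng)   # 10 samplers per pixel (panels C,D)
sparse = sample(image, 1, rng)   # 1 sampler per pixel (panels E,F)

print(f"MSE, 10 samplers per pixel: {mse(image, dense):.4f}")
print(f"MSE, 1 sampler per pixel:  {mse(image, sparse):.4f}")
```

Averaging over more samplers per pixel shrinks the sampling variance (roughly as 1/N for N samplers), which is why ten samplers recover the image well while a single sampler leaves it noisy, with the soft, light features suffering most.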
