J Assoc Res Otolaryngol. 2012 Dec;13(6):835-52.
doi: 10.1007/s10162-012-0343-2. Epub 2012 Aug 8.

Level-dependent changes in perception of speech envelope cues

Judy R Dubno et al. J Assoc Res Otolaryngol. 2012 Dec.

Abstract

Level-dependent changes in temporal envelope fluctuations in speech and related changes in speech recognition may reveal effects of basilar-membrane nonlinearities. As a result of compression in the basilar-membrane response, the "effective" magnitude of envelope fluctuations may be reduced as speech level increases from lower level (more linear) to mid-level (more compressive) regions. With further increases to a more linear region, speech envelope fluctuations may become more pronounced. To assess these effects, recognition of consonants and key words in sentences was measured as a function of speech level for younger adults with normal hearing. Consonant-vowel syllables and sentences were spectrally degraded using "noise vocoder" processing to maximize perceptual effects of changes to the speech envelope. Broadband noise at a fixed signal-to-noise ratio maintained constant audibility as speech level increased. Results revealed significant increases in scores and envelope-dependent feature transmission from 45 to 60 dB SPL and decreasing scores and feature transmission from 60 to 85 dB SPL. This quadratic pattern, with speech recognition maximized at mid levels and poorer at lower and higher levels, is consistent with a role of cochlear nonlinearities in perception of speech envelope cues.
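The quadratic pattern described above can be illustrated with a least-squares fit of a second-order polynomial to a score-level function. The following is a minimal sketch, not the authors' analysis code: the five speech levels are those named in the abstract, but the scores are hypothetical values chosen for illustration, and the function names (`fit_quadratic`, `peak_level`, `slope_at`) are assumptions. A negative quadratic coefficient corresponds to scores that peak at mid levels and fall at lower and higher levels.

```python
# Illustrative values only, NOT data from the study.
LEVELS_DB_SPL = [45.0, 50.0, 60.0, 70.0, 85.0]   # speech levels named in the abstract
SCORES_PERCENT = [65.5, 68.0, 70.0, 68.0, 57.5]  # hypothetical percent-correct scores


def fit_quadratic(levels, scores):
    """Least-squares fit of score = a*x**2 + b*x + c, where x = level - mean level."""
    center = sum(levels) / len(levels)
    xs = [lv - center for lv in levels]  # centering improves numerical conditioning
    # Power sums needed by the normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, scores)) for k in range(3)]
    # Augmented matrix for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c, center


def peak_level(a, b, center):
    """Level (dB SPL) at which the fitted parabola reaches its extremum."""
    return center - b / (2.0 * a)


def slope_at(level, a, b, center):
    """Derivative of the fitted function (percent per dB) at a given level."""
    return 2.0 * a * (level - center) + b


a, b, c, center = fit_quadratic(LEVELS_DB_SPL, SCORES_PERCENT)
print("quadratic coefficient:", a)
print("peak level (dB SPL):", peak_level(a, b, center))
print("slope at 45 dB SPL:", slope_at(45.0, a, b, center))
print("slope at 85 dB SPL:", slope_at(85.0, a, b, center))
```

With a fit of this kind, a positive slope at the lowest level and a negative slope at the highest level would both follow from a single concave function, which is one way to summarize each listener's score-level function. The abstract does not state the order of the polynomial the authors used, so the second-order choice here is an assumption.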

Figures

FIG. 1
One-third-octave band spectra of vocoded CV syllables (blue lines), sentences (red lines), and background noise (thick black lines) for five overall speech levels (45, 50, 60, 70, and 85 dB SPL). Mean quiet thresholds are also shown in each panel (triangles).
FIG. 2
Mean (thick lines) and individual (thin lines) recognition scores plotted as a function of speech level, for consonants (blue, top) and key words in sentences (red, middle). Mean scores for the two speech materials are also displayed in the bottom panel. For clarity, some data points are offset along the abscissa. Error bars indicate ±1 standard deviation.
FIG. 3
Key word recognition scores plotted against consonant recognition scores, for five speech levels (top to bottom panels). Pearson correlation coefficients and linear regressions are included in each panel.
FIG. 4
Top: mean information transmitted plotted as a function of speech level for the three acoustic-phonetic features of voicing, manner of articulation, and place of articulation. Bottom: same as top panel, but for three sub-categories of manners of articulation (plosive, nasality, and frication).
FIG. 5
Slope (percent per dB) at the highest speech level (85 dB SPL) plotted against slope at the lowest speech level (45 dB SPL), for recognition of consonants (top) and key words in sentences (bottom). Slopes for each speech material were computed from the polynomial fit applied to the score-level function for each subject. Pearson correlation coefficients and linear regression functions are included in each panel.
FIG. 6
Slope (percent per dB) calculated from speech scores at the highest level (85 dB SPL) plotted against the range of scores for recognition of consonants (top) and key words in sentences (bottom). Pearson correlation coefficients and linear regression functions are included in each panel.
FIG. 7
Top: DPOAE levels plotted as a function of L2 for f2 of 1.0 kHz. DPOAE input–output function slopes were computed from DPOAE levels recorded for L2 between 40 and 65 dB SPL (red lines). Bottom: slopes of DPOAE input–output functions for f2 of 1.0 kHz (from the top panel) plotted against DPOAE summed levels. The Pearson correlation coefficient and linear regression function are also included.
FIG. 8
Slopes of DPOAE input–output functions for f2 of 2.0 kHz plotted against the range of key word recognition scores. The Pearson correlation coefficient and linear regression function are also included.
FIG. 9
Top: slopes of DPOAE input–output functions for an f2 of 1.0 kHz plotted against the change in consonant recognition scores with speech level increasing from 60 to 70 dB SPL. Bottom: slopes of DPOAE input–output functions for an f2 of 2.0 kHz plotted against the change in key word recognition scores with speech level increasing from 60 to 70 dB SPL. Pearson correlation coefficients and linear regression functions are included in each panel.
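
The DPOAE captions describe two computations: an input-output slope taken over L2 between 40 and 65 dB SPL, and Pearson correlations between those slopes and speech measures. The sketch below shows one plausible way to compute both. It assumes the slope is an ordinary least-squares line fitted to the points inside that L2 range, which the captions do not confirm. All numbers are hypothetical and the function names are illustrative.

```python
import math


def linear_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def io_slope(l2_levels, dpoae_levels, lo=40.0, hi=65.0):
    """DPOAE input-output slope (dB/dB) using only points with lo <= L2 <= hi."""
    pairs = [(l2, d) for l2, d in zip(l2_levels, dpoae_levels) if lo <= l2 <= hi]
    xs, ys = zip(*pairs)
    return linear_slope(xs, ys)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical single-ear input-output function, NOT data from the study.
L2_DB_SPL = [30.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 70.0]
DPOAE_DB_SPL = [-5.0, 2.0, 3.5, 5.0, 6.5, 8.0, 9.5, 15.0]
slope = io_slope(L2_DB_SPL, DPOAE_DB_SPL)
print("DPOAE I/O slope (dB/dB):", slope)

# Hypothetical per-subject values, used only to exercise pearson_r.
subject_slopes = [0.20, 0.35, 0.30, 0.50, 0.45]
score_ranges = [10.0, 16.0, 12.0, 22.0, 18.0]
print("Pearson r:", pearson_r(subject_slopes, score_ranges))
```

Restricting the fit to a fixed L2 range keeps points from the more linear low-level and high-level portions of the input-output function out of the slope estimate, so the slope reflects growth in the mid-level region only.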
