J Am Acad Audiol. 2013 Apr;24(4):293-306.
doi: 10.3766/jaaa.24.4.5.

Spectrotemporal modulation sensitivity as a predictor of speech intelligibility for hearing-impaired listeners

Joshua G W Bernstein et al. J Am Acad Audiol. 2013 Apr.

Abstract

Background: A model that can accurately predict speech intelligibility for a given hearing-impaired (HI) listener would be an important tool for hearing-aid fitting or hearing-aid algorithm development. Existing speech-intelligibility models do not incorporate variability in suprathreshold deficits that are not well predicted by classical audiometric measures. One possible approach to the incorporation of such deficits is to base intelligibility predictions on sensitivity to simultaneously spectrally and temporally modulated signals.

Purpose: The likelihood of success of this approach was evaluated by comparing estimates of spectrotemporal modulation (STM) sensitivity to speech intelligibility and to psychoacoustic estimates of frequency selectivity and temporal fine-structure (TFS) sensitivity across a group of HI listeners.

Research design: The minimum modulation depth required to detect STM applied to an 86 dB SPL four-octave noise carrier was measured for combinations of temporal modulation rate (4, 12, or 32 Hz) and spectral modulation density (0.5, 1, 2, or 4 cycles/octave). STM sensitivity estimates for individual HI listeners were compared to estimates of frequency selectivity (measured using the notched-noise method at 500, 1000, 2000, and 4000 Hz), TFS processing ability (2 Hz frequency-modulation detection thresholds for 500, 1000, 2000, and 4000 Hz carriers) and sentence intelligibility in noise (at a 0 dB signal-to-noise ratio) that were measured for the same listeners in a separate study.
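As a rough illustration of the stimulus described above, the following Python sketch builds a spectrotemporally modulated noise from a bank of random-phase tones. It is not the authors' stimulus code. The abstract specifies only the four-octave carrier, the temporal rates, the spectral densities, and the upward or downward direction. The lowest carrier frequency (`f_low`), the number of tones, the duration, the sampling rate, and the choice of a sinusoidal envelope on linear amplitude are all assumptions made here for illustration.

```python
import numpy as np

def stm_ripple(rate_hz, density_cpo, depth, direction=1,
               f_low=354.0, n_octaves=4.0, n_tones=400,
               dur_s=0.5, fs=16000, seed=0):
    """Sketch of an STM ripple: log-spaced tones with a drifting sinusoidal envelope.

    rate_hz      temporal modulation rate (e.g., 4, 12, or 32 Hz)
    density_cpo  spectral modulation density in cycles/octave (e.g., 0.5 to 4)
    depth        modulation depth between 0 (unmodulated) and 1 (full)
    direction    +1 or -1; flips the sign of the spectral term, which
                 reverses the direction in which the ripple drifts
    f_low, n_tones, dur_s, fs are illustrative assumptions, not values
    taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur_s * fs)) / fs
    # Position of each carrier tone in octaves above f_low
    octaves = np.linspace(0.0, n_octaves, n_tones)
    freqs = f_low * 2.0 ** octaves
    tone_phases = rng.uniform(0.0, 2.0 * np.pi, n_tones)
    ripple_phase = rng.uniform(0.0, 2.0 * np.pi)

    signal = np.zeros_like(t)
    for x, f, ph in zip(octaves, freqs, tone_phases):
        # Envelope varies jointly with time (rate) and log frequency (density)
        env = 1.0 + depth * np.sin(
            2.0 * np.pi * (rate_hz * t + direction * density_cpo * x) + ripple_phase
        )
        signal += env * np.sin(2.0 * np.pi * f * t + ph)
    # Normalize to unit peak; calibration to a level such as 86 dB SPL is hardware-specific
    return signal / np.max(np.abs(signal))

# Example: the 4 Hz, 2 cycles/octave condition at 50% modulation depth
ripple = stm_ripple(rate_hz=4, density_cpo=2, depth=0.5)
```

In a detection task, a stimulus like this would be compared against an unmodulated reference (`depth=0`), with `depth` varied adaptively to find the minimum detectable value.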

Study sample: Eight normal-hearing (NH) listeners and 12 listeners with a diagnosis of bilateral sensorineural hearing loss participated.

Data collection and analysis: STM sensitivity was compared between NH and HI listener groups using a repeated-measures analysis of variance. A stepwise regression analysis compared STM sensitivity for individual HI listeners to audiometric thresholds, age, and measures of frequency selectivity and TFS processing ability. A second stepwise regression analysis compared speech intelligibility to STM sensitivity and the audiogram-based Speech Intelligibility Index.

Results: STM detection thresholds were elevated for the HI listeners, but only for low rates and high densities. STM sensitivity for individual HI listeners was well predicted by a combination of estimates of frequency selectivity at 4000 Hz and TFS sensitivity at 500 Hz but was unrelated to audiometric thresholds. STM sensitivity accounted for an additional 40% of the variance in speech intelligibility beyond the 40% accounted for by the audibility-based Speech Intelligibility Index.
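The "additional variance" reported above is the gain in R² when a predictor is added to a regression model that already contains another. The sketch below shows that calculation with ordinary least squares. The data are synthetic and generated only for illustration; they are not the study's data, and the coefficients, noise level, and variable names (`sii`, `stm`, `speech`) are assumptions. The stepwise selection procedure used in the study is not reproduced here.

```python
import numpy as np

def r_squared(predictors, y):
    """Proportion of variance in y explained by an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic listeners (NOT the study's data): 12 hypothetical HI listeners
rng = np.random.default_rng(1)
n = 12
sii = rng.uniform(0.3, 0.9, n)      # hypothetical audibility index values
stm = rng.normal(0.0, 1.0, n)       # hypothetical STM thresholds (arbitrary units)
speech = 50 + 40 * sii - 8 * stm + rng.normal(0.0, 3.0, n)  # hypothetical scores

r2_sii = r_squared(sii, speech)                            # audibility alone
r2_both = r_squared(np.column_stack([sii, stm]), speech)   # audibility plus STM
delta = r2_both - r2_sii                                   # variance added by STM

print(f"R^2 (SII only): {r2_sii:.2f}")
print(f"R^2 (SII + STM): {r2_both:.2f}")
print(f"Additional variance from STM: {delta:.2f}")
```

Because the second model contains the first, its R² can never be lower; the question of interest is whether the increase is large enough to matter for a sample of this size.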

Conclusions: Impaired STM sensitivity likely results from a combination of a reduced ability to resolve spectral peaks and a reduced ability to use TFS information to follow spectral-peak movements. Combining STM sensitivity estimates with audiometric threshold measures for individual HI listeners provided a more accurate prediction of speech intelligibility than audiometric measures alone. These results suggest a significant likelihood of success for an STM-based model of speech intelligibility for HI listeners.

Figures

Figure 1. A–C: Example spectrograms for STM ripple stimuli with various combinations of direction, spectral density, and temporal rate. D: A complex spectrogram formed by adding together the three spectrograms from panels A–C.
Figure 2. Mean audiograms for the NH and HI listener groups.
Figure 3. Group-mean STM detection thresholds averaged across upward- and downward-moving conditions. Error bars indicate ±1 SE across listeners in each group.
Figure 4. Actual STM detection thresholds for individual HI listeners (for the 4 Hz, 2 c/o condition, averaged across upward- and downward-moving ripples) are plotted as a function of the threshold predicted by a linear regression model with the 500 Hz FM detection threshold and 4000 Hz ERB as inputs.
Figure 5. Speech reception performance for sentence keywords presented in stationary noise at a 0 dB SNR is plotted as a function of the audiogram-based SII for individual HI listeners.
Figure 6. Actual speech intelligibility is plotted as a function of the intelligibility predicted by a linear regression model with SII and STM sensitivity (4 Hz, 2 c/o) as inputs.
