J Acoust Soc Am. 2013 Sep;134(3):2205-12. doi: 10.1121/1.4816413.

Role and relative contribution of temporal envelope and fine structure cues in sentence recognition by normal-hearing listeners

Frédéric Apoux et al.

Abstract

The present study investigated the role and relative contribution of envelope and temporal fine structure (TFS) to sentence recognition in noise. Target and masker stimuli were added at five different signal-to-noise ratios (SNRs) and filtered into 30 contiguous frequency bands. The envelope and TFS were extracted from each band by Hilbert decomposition. The final stimuli consisted of the envelope of the target/masker sound mixture at x dB SNR and the TFS of the same sound mixture at y dB SNR. A first experiment showed a very limited contribution of TFS cues, indicating that sentence recognition in noise relies almost exclusively on temporal envelope cues. A second experiment showed that replacing the carrier of a sound mixture with noise (vocoder processing) cannot be considered equivalent to disrupting the TFS of the target signal by adding a background noise. Accordingly, a re-evaluation of the vocoder approach as a model to further understand the role of TFS cues in noisy situations may be necessary. Overall, these data are consistent with the view that speech information is primarily extracted from the envelope while TFS cues are primarily used to detect glimpses of the target.
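The stimulus processing described above can be made concrete with a short sketch. The filter type and order, the level-matching method, and the helper names band_filter, mix_at_snr, and envelope_tfs_chimera are illustrative assumptions, not the authors' implementation; only the general scheme (band-pass analysis into contiguous bands, Hilbert envelope/TFS extraction, and pairing the envelope of one mixture with the TFS of another) follows the abstract.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def band_filter(x, lo, hi, fs, order=4):
        """Zero-phase band-pass filter for one analysis band (Butterworth here;
        the exact filter shape used in the study is not given in the abstract)."""
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        return sosfiltfilt(sos, x)

    def mix_at_snr(target, masker, snr_db):
        """Scale the masker so that target + masker sits at snr_db (broadband RMS)."""
        rms = lambda s: np.sqrt(np.mean(s ** 2))
        gain = rms(target) / (rms(masker) * 10 ** (snr_db / 20))
        return target + gain * masker

    def envelope_tfs_chimera(target, masker, snr_env, snr_tfs, fs, band_edges):
        """Within each band, pair the Hilbert envelope of the mixture made at
        snr_env with the Hilbert fine structure of the mixture made at snr_tfs,
        then sum the recombined bands."""
        mix_env = mix_at_snr(target, masker, snr_env)
        mix_tfs = mix_at_snr(target, masker, snr_tfs)
        out = np.zeros_like(target, dtype=float)
        for lo, hi in band_edges:
            env = np.abs(hilbert(band_filter(mix_env, lo, hi, fs)))            # temporal envelope
            tfs = np.cos(np.angle(hilbert(band_filter(mix_tfs, lo, hi, fs))))  # fine structure
            out += band_filter(env * tfs, lo, hi, fs)                          # limit spectral splatter
        return out

For the 30 contiguous bands of the study, band_edges would be a list of 30 (lo, hi) pairs spanning the speech frequency range; the actual edges are not specified in the abstract and would need to be taken from the paper.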


Figures

Figure 1
Schematic of the processing used to create the stimuli. The grayed area illustrates the processing within each of the 30 analysis filters.
Figure 2
Correlation coefficients between the original speech and the chimeric sound's envelopes (filled symbols) and TFS (open symbols) as a function of the SNR. The circles correspond to the SSN conditions while the squares correspond to the SPE conditions.
Figure 3
Average sentence recognition scores in speech-shaped noise (SSN) as a function of the SNR of the envelope (left panel) and as a function of the SNR of the TFS (right panel), with SNRtfs and SNRenv as the parameter, respectively. In each panel, a bold line (REF) connects the data points for which SNRenv and SNRtfs were equal.
Figure 4
The same as Fig. 3 but for the speech masker (SPE).
Figure 5
Average sentence recognition scores as a function of the SNR of the envelope (SNRenv). The parameter is the SNR of the TFS (SNRtfs). The left and right panels show scores in speech-shaped noise (SSN) and speech (SPE), respectively. In each panel, the filled symbols correspond to the data from Exp. 2, while the open symbol corresponds to selected data from Exp. 1.
Figure 6
Average sentence recognition scores as a function of the SNR of the envelope (SNRenv). The left and right panels show scores in speech-shaped noise (SSN) and speech (SPE), respectively. In each panel, the filled symbols correspond to the data for three different SNRtfs values while the black and white symbol corresponds to the noise-carrier data (i.e., vocoder).
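Experiment 2 and Figure 6 contrast these envelope/TFS exchange conditions with a noise-carrier (vocoder) version of the same mixture. A minimal sketch of that manipulation for a single band is given below; the filter details and the helper name noise_vocoded_band are assumptions, and only the idea of keeping the band envelope while replacing its carrier with band-limited noise is taken from the text.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocoded_band(mixture, fs, lo, hi, order=4, seed=0):
        """One analysis band of a noise vocoder: keep the band's Hilbert envelope,
        discard its original carrier, and remodulate band-limited Gaussian noise."""
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, mixture)
        env = np.abs(hilbert(band))                                            # band envelope
        noise = sosfiltfilt(sos, np.random.default_rng(seed).standard_normal(len(mixture)))
        noise /= np.sqrt(np.mean(noise ** 2))                                  # unit-RMS noise carrier
        return sosfiltfilt(sos, env * noise)                                   # re-filter after modulation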

