Front Neurosci. 2021 Jul 15;15:670192.
doi: 10.3389/fnins.2021.670192. eCollection 2021.

The Relative Weight of Temporal Envelope Cues in Different Frequency Regions for Mandarin Disyllabic Word Recognition

Zhong Zheng et al. Front Neurosci.

Abstract

Objectives: Acoustic temporal envelope (E) cues containing speech information are distributed across the frequency spectrum. To provide a theoretical basis for the signal coding of hearing devices, we examined the relative weight of E cues in different frequency regions for Mandarin disyllabic word recognition in quiet.

Design: E cues were extracted from 30 contiguous frequency bands within the range of 80 to 7,562 Hz using Hilbert decomposition and assigned to five frequency regions from low to high. Disyllabic word recognition scores of 20 normal-hearing participants were obtained using the E cues available in two, three, or four frequency regions. The relative weights of the five frequency regions were calculated using a least-squares approach.
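As an illustration of the envelope-extraction step described in the Design, the following is a minimal sketch of a Hilbert envelope for a single band-limited signal. It is not the authors' implementation: the band-pass filtering into 30 bands is omitted, the test signal is synthetic, and the names `dft` and `hilbert_envelope` are hypothetical helpers introduced here. The envelope is taken as the magnitude of the analytic signal, which is built by zeroing the negative-frequency half of the spectrum.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short demos)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(x):
    """Return the temporal envelope |analytic signal| of a real sequence."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    analytic = dft([g * s for g, s in zip(gain, spectrum)], inverse=True)
    return [abs(z) for z in analytic]


# Synthetic example (assumed values): a carrier with 16 cycles per frame,
# amplitude-modulated by a slow 2-cycle envelope.
n = 128
modulator = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
signal = [m * math.cos(2 * math.pi * 16 * t / n)
          for t, m in enumerate(modulator)]

envelope = hilbert_envelope(signal)
max_error = max(abs(e - m) for e, m in zip(envelope, modulator))
```

In this idealized case the recovered envelope should match the slow modulator closely, because the modulator and carrier occupy separate frequency ranges. With real speech, each band would first be isolated by a band-pass filter, and the envelope is often low-pass filtered afterwards; those choices are not specified in the abstract.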

Results: Participants correctly identified 3.13-38.13%, 27.50-83.13%, or 75.00-93.13% of words when presented with two, three, or four frequency regions, respectively. Increasing the number of frequency regions combined improved recognition scores and decreased the magnitude of the differences in scores between combinations. This suggested a synergistic effect among E cues from different frequency regions. The mean weights of E cues of frequency regions 1-5 were 0.31, 0.19, 0.26, 0.22, and 0.02, respectively.
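To make the idea of least-squares relative weights concrete, here is a hedged sketch of one common formulation. The abstract does not give the exact model, so everything below is an assumption: each listening condition is coded as a binary vector marking which of the five regions is present, the score is modeled as a sum of per-region contributions, and the fitted coefficients are normalized to sum to 1. The scores are synthetic and noise-free, generated from made-up weights, and are not the study's data. `solve_linear` and `region_weights` are hypothetical helper names.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def region_weights(conditions, scores, n_regions=5):
    """Least-squares fit of score ~ sum of present-region coefficients,
    with the coefficients normalized to sum to 1."""
    # Design matrix: 1.0 where a region is present in the condition.
    x = [[1.0 if r in cond else 0.0 for r in range(n_regions)]
         for cond in conditions]
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(n_regions)]
           for i in range(n_regions)]
    xty = [sum(row[i] * s for row, s in zip(x, scores))
           for i in range(n_regions)]
    beta = solve_linear(xtx, xty)
    total = sum(beta)
    return [b / total for b in beta]


# Synthetic, noise-free demo (assumed weights, NOT the study's results).
true_weights = [0.30, 0.20, 0.25, 0.20, 0.05]
conditions = [set(c) for k in (2, 3, 4)
              for c in combinations(range(5), k)]
scores = [sum(true_weights[r] for r in cond) for cond in conditions]

weights = region_weights(conditions, scores)
```

With all 25 two-, three-, and four-region combinations and scores generated by a purely additive model, the fit should recover the assumed weights up to floating-point error. Real recognition scores are bounded and, as the Results note, show synergy between regions, so a purely additive fit is only an approximation.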

Conclusion: For Mandarin disyllabic words, E cues of frequency regions 1 (80-502 Hz) and 3 (1,022-1,913 Hz) contributed more to word recognition than other regions, while frequency region 5 (3,856-7,562 Hz) contributed little.

Keywords: Mandarin Chinese; disyllabic word; envelope cues; frequency region; relative weight.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Percent-correct scores for disyllabic word recognition using acoustic temporal envelope in two-frequency-region conditions.
FIGURE 2
Percent-correct scores for disyllabic word recognition using acoustic temporal envelope in three-frequency-region conditions.
FIGURE 3
Percent-correct scores for disyllabic word recognition using acoustic temporal envelope in four-frequency-region conditions.
FIGURE 4
The relative weights of different frequency regions for Mandarin disyllabic word and sentence recognition using acoustic temporal envelope. The data for Mandarin sentence recognition was adopted from a previous study (Guo et al., 2017). The error bars represent standard errors. * Statistically significant (p < 0.05).
