Review

Philos Trans R Soc Lond B Biol Sci. 2008 Mar 12;363(1493):947-63.

doi: 10.1098/rstb.2007.2152.

Basic auditory processes involved in the analysis of speech sounds

Brian C J Moore¹

PMID: 17827102
PMCID: PMC2606789
DOI: 10.1098/rstb.2007.2152

Brian C J Moore. Philos Trans R Soc Lond B Biol Sci. 2008.

Author

Brian C J Moore¹

Affiliation

¹ Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK. bcjm@cam.ac.uk

Abstract

This paper reviews the basic aspects of auditory processing that play a role in the perception of speech. The frequency selectivity of the auditory system, as measured using masking experiments, is described and used to derive the internal representation of the spectrum (the excitation pattern) of speech sounds. The perception of timbre and distinctions in quality between vowels are related to both static and dynamic aspects of the spectra of sounds. The perception of pitch and its role in speech perception are described. Measures of the temporal resolution of the auditory system are described and a model of temporal resolution based on a sliding temporal integrator is outlined. The combined effects of frequency and temporal resolution can be modelled by calculation of the spectro-temporal excitation pattern, which gives good insight into the internal representation of speech sounds. For speech presented in quiet, the resolution of the auditory system in frequency and time usually markedly exceeds the resolution necessary for the identification or discrimination of speech sounds, which partly accounts for the robust nature of speech perception. However, for people with impaired hearing, speech perception is often much less robust.

Figures

**Figure 1**
Psychophysical tuning curves (PTCs) determined in simultaneous masking, using sinusoidal signals at 10 dB SL. For each curve, the solid circle below it indicates the frequency and level of the signal. The masker was a sinusoid which had a fixed starting phase relationship with the 50 ms signal. The masker level required for threshold is plotted as a function of masker frequency on a logarithmic scale. The dashed line shows the absolute threshold for the signal. Data from Vogten (1978).

**Figure 2**
Schematic illustration of the technique used by Patterson (1976) to determine the shape of the auditory filter. The threshold of the sinusoidal signal (indicated by the bold vertical line) is measured as a function of the width of a spectral notch in the noise masker. The amount of noise passing through the auditory filter centred at the signal frequency is proportional to the shaded areas.

**Figure 3**
A typical auditory filter shape determined using the notched-noise method. The filter is centred at 1 kHz. The relative response of the filter (in decibels) is plotted as a function of frequency.
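A filter shape of this general kind can be sketched with the rounded-exponential roex(p) function, W(g) = (1 + pg)exp(-pg), which is often fitted to notched-noise data. This is an illustration under stated assumptions, not the fit behind the figure: the caption does not say which function was used, the filter is taken to be symmetric, and the bandwidth comes from the ERB_N approximation of Glasberg & Moore (1990), which is not given on this page. The function names are hypothetical.

```python
import math

def erb_n_hz(fc_hz):
    """ERB_N in Hz at centre frequency fc_hz (assumed approximation:
    24.7 * (4.37 * F + 1), with F in kHz)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def roex_response_db(f_hz, fc_hz=1000.0):
    """Relative response in dB of a symmetric roex(p) filter centred at fc_hz."""
    # The equivalent rectangular bandwidth of a roex(p) filter is 4 * fc / p,
    # so p is chosen here to match the assumed ERB_N value.
    p = 4.0 * fc_hz / erb_n_hz(fc_hz)
    g = abs(f_hz - fc_hz) / fc_hz          # normalised deviation from centre
    weight = (1.0 + p * g) * math.exp(-p * g)
    return 10.0 * math.log10(weight)

# Tabulate the response around a 1 kHz centre frequency.
for f in (600, 800, 900, 1000, 1100, 1200, 1400):
    print(f, round(roex_response_db(f), 1))
```

Measured filters are generally asymmetric and level dependent, so a single value of p for both skirts is a simplification.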

**Figure 4**
Masking patterns for a narrowband noise masker centred at 410 Hz. Each curve shows the elevation in threshold of a pure-tone signal as a function of signal frequency. The overall noise level in dB SPL for each curve is indicated in the figure. Data from Egan & Hake (1950).

**Figure 5**
Excitation patterns for a 1000 Hz sinusoid at levels ranging from 20 to 90 dB SPL in 10 dB steps.

**Figure 6**
(a) The spectrum of a synthetic vowel /I/ plotted on a linear frequency scale. (b) The same spectrum plotted on an ERB_N-number scale. (c) The excitation pattern for the vowel plotted on an ERB_N-number scale.
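The change of axis between panels (a) and (b) can be sketched as a frequency-to-ERB_N-number conversion. The formula below, ERB_N-number = 21.4 log10(4.37F + 1) with F in kHz, is the Glasberg & Moore (1990) approximation; it is assumed here because the caption does not state it, and the function names are hypothetical.

```python
import math

def hz_to_erb_number(f_hz):
    """Map frequency in Hz to the ERB_N-number scale (assumed formula)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_number_to_hz(erb_number):
    """Inverse mapping: ERB_N-number back to frequency in Hz."""
    return (10.0 ** (erb_number / 21.4) - 1.0) / 4.37 * 1000.0

# Equal steps on the ERB_N-number scale correspond to progressively
# wider steps in Hz, which compresses the high-frequency end of a spectrum.
for erb_number in range(5, 35, 5):
    print(erb_number, round(erb_number_to_hz(erb_number)))
```

Replotting a spectrum on this axis only moves the components; deriving the excitation pattern in panel (c) additionally requires passing the spectrum through auditory filters.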

**Figure 7**
The points labelled ‘R’ are thresholds for detecting a 1 kHz signal centred in a band of random noise, plotted as a function of the bandwidth of the noise. The points labelled ‘M’ are the thresholds obtained when the noise was amplitude modulated at an irregular, low rate. Reproduced with permission from Hall *et al*. (1984) and *J. Acoust. Soc. Am.*

**Figure 8**
Excitation patterns for three vowels, /i/, /a/ and /u/, plotted on an ERB_N-number scale.

**Figure 9**
Illustration of the filters used by Watkins & Makin (1996a). (a,b) ‘Filters’ corresponding to the spectral envelopes of the vowels ‘/ε/’ and ‘/I/’, respectively. (c) Filter corresponding to the difference between the spectral envelopes of the vowels ‘/ε/’ and ‘/I/’.

**Figure 10**
A temporal modulation transfer function (TMTF). A broadband white noise was sinusoidally amplitude modulated, and the threshold amount of modulation required for detection is plotted as a function of modulation rate. The amount of modulation is specified as 20 log m, where m is the modulation index. The higher the sensitivity to modulation, the more negative is this quantity. Data from Bacon & Viemeister (1985).
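The 20 log m convention in the caption can be illustrated with a short sketch. It assumes the usual definition of sinusoidal amplitude modulation, in which the carrier is multiplied by 1 + m sin(2π f_m t); the function and parameter names are hypothetical and nothing here reproduces the Bacon & Viemeister (1985) procedure.

```python
import math

def modulation_depth_db(m):
    """Express a modulation index m (0 < m <= 1) as 20*log10(m) in dB."""
    if not 0.0 < m <= 1.0:
        raise ValueError("modulation index must be in (0, 1]")
    return 20.0 * math.log10(m)

def apply_sam(carrier, fm_hz, m, fs_hz):
    """Sinusoidally amplitude modulate a list of carrier samples."""
    return [
        (1.0 + m * math.sin(2.0 * math.pi * fm_hz * n / fs_hz)) * x
        for n, x in enumerate(carrier)
    ]

# A smaller detectable m gives a more negative value, i.e. higher sensitivity.
for m in (1.0, 0.5, 0.1, 0.05):
    print(m, round(modulation_depth_db(m), 1))
```

In an experiment of this kind, `carrier` would hold samples of broadband noise and m would be varied adaptively to find the threshold at each modulation rate.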

**Figure 11**
Spectro-temporal excitation pattern (STEP) of the word ‘tips’. The figure was produced by Prof. C. J. Plack. Adapted from Moore (2003c).

References

1. Aibara R, Welsh J.T, Puria S, Goode R.L. Human middle-ear sound transfer function and cochlear input impedance. Hear. Res. 2001;152:100–109. doi:10.1016/S0378-5955(00)00240-9
2. Alcántara J.I, Moore B.C.J, Vickers D.A. The relative role of beats and combination tones in determining the shapes of masking patterns at 2 kHz: I. Normal-hearing listeners. Hear. Res. 2000;148:63–73. doi:10.1016/S0378-5955(00)00114-3
3. ANSI. American National Standards Institute; New York, NY: 1994. ANSI S1.1-1994. American national standard acoustical terminology.
4. Bacon S.P, Viemeister N.F. Temporal modulation transfer functions in normal-hearing and hearing-impaired subjects. Audiology. 1985;24:117–134.
5. Brungart D.S, Simpson B.D, Darwin C.J, Arbogast T.L, Kidd G, Jr. Across-ear interference from parametrically degraded synthetic speech signals in a dichotic cocktail-party listening task. J. Acoust. Soc. Am. 2005;117:292–304. doi:10.1121/1.1835509
