Review
Curr Opin Neurobiol. 2010 Jun;20(3):361-6.
doi: 10.1016/j.conb.2010.03.009. Epub 2010 Apr 22.

Behind the scenes of auditory perception

Shihab A Shamma et al. Curr Opin Neurobiol. 2010 Jun.

Abstract

'Auditory scenes' often contain contributions from multiple acoustic sources. These are usually heard as separate auditory 'streams', which can be selectively followed over time. How and where these auditory streams are formed in the auditory system is one of the most fascinating questions facing auditory scientists today. Findings published within the past two years indicate that both cortical and subcortical processes contribute to the formation of auditory streams, and they raise important questions concerning the roles of primary and secondary areas of auditory cortex in this phenomenon. In addition, these findings underline the importance of taking into account the relative timing of neural responses, and the influence of selective attention, in the search for neural correlates of the perception of auditory streams.

Figures

Figure 1
The psychophysical, biological, and computational facets of the “auditory scene analysis” problem.

(a) From bottom to top: Acoustic waves coming from different sound sources mingle in the propagating medium. The physical characteristics (intensity, frequency, spatial location) of the sounds produced by these sources become entangled before entering the listener’s ear. However, listeners rarely experience natural or artificial acoustic scenes as an inextricable jumble of sounds. Instead, they hear separate auditory “streams”, each having its own perceptual attributes (loudness, pitch, timbre, and perceived location). They can attend selectively to one of these streams and to its attributes (shown here in red).

(b) The mammalian auditory system is a multi-storey building. From bottom to top: outer and middle ear, cochlea, and auditory nerve (AN); cochlear nucleus (CN) and superior olivary complex (SOC) in the brainstem; inferior colliculus (IC) in the midbrain; medial geniculate body (MGB) in the thalamus; primary and secondary auditory cortex (AC). The challenge for auditory scientists is to clarify how neural responses at each of these processing stages relate (or not) to the listener’s perception, and at which stage the conscious percept of an auditory stream emerges. The task is further complicated by the presence of ascending and descending projections, which provide opportunities for both bottom-up and top-down influences at all stages, including the cochlea and the secondary AC. Moreover, cortical areas that are not traditionally considered part of the auditory system may also contribute.

(c) A computational auditory scene-analysis model based loosely on physiological findings. From bottom to top: input waveforms (amplitude of vibration as a function of time) are transformed into representations involving other dimensions, such as frequency (a determinant of pitch and timbre), fundamental frequency (F0, a determinant of pitch), or location (left-right). Temporally coherent “events” in these representations are grouped within each dimension, as well as across different dimensions. This results in auditory streams with associated attributes. Stream formation also depends on how well the events are separated along the dimensions. For instance, non-synchronous events may still be grouped into a common stream if they are close in frequency or fundamental frequency. Finally, attention is directed selectively toward one of the streams. This enhances the representation(s) of that stream and suppresses the representations of concurrent streams. The dashed lines and associated question marks indicate potential influences of selective attention on the analysis of sound features, and on the stream-formation process itself.
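The grouping and attention steps described for panel (c) can be illustrated with a toy sketch. This is not the model shown in the figure: the function names (`pearson`, `group_by_coherence`, `attend`), the correlation threshold, the gain values, and the hand-made channel envelopes are all assumptions chosen for illustration. The sketch groups channels only by the temporal coherence of their envelopes and leaves out the proximity-based grouping of non-synchronous events that the caption also mentions.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def group_by_coherence(envelopes, threshold=0.8):
    """Merge channels whose envelopes correlate at or above `threshold`.

    `envelopes` maps a channel label to its envelope over time.
    Returns a sorted list of streams, each a sorted list of labels.
    """
    names = list(envelopes)
    parent = {name: name for name in names}  # union-find forest

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if pearson(envelopes[a], envelopes[b]) >= threshold:
            parent[find(a)] = find(b)

    streams = {}
    for name in names:
        streams.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in streams.values())


def attend(envelopes, stream, gain=2.0, suppression=0.5):
    """Scale up channels in the attended stream and scale down all others."""
    return {
        name: [v * (gain if name in stream else suppression) for v in env]
        for name, env in envelopes.items()
    }


# Hypothetical channel envelopes: two channels switch on and off together,
# while a third alternates with them.
envelopes = {
    "500 Hz": [1, 0, 1, 0, 1, 0, 1, 0],
    "1000 Hz": [1, 0, 1, 0, 1, 0, 1, 0],
    "2000 Hz": [0, 1, 0, 1, 0, 1, 0, 1],
}
streams = group_by_coherence(envelopes)
attended = attend(envelopes, streams[0])
print(streams)
```

In this toy input the 500 Hz and 1000 Hz channels are perfectly correlated and fall into one stream, while the anti-correlated 2000 Hz channel forms a second stream. A union-find structure is used so that coherence is treated as transitive across channels, which is a design choice of this sketch rather than a claim about the underlying physiology.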
