Review
Front Neurosci. 2014 Mar 31:8:60.
doi: 10.3389/fnins.2014.00060. eCollection 2014.

Predictability effects in auditory scene analysis: a review

Alexandra Bendixen

Abstract

Many sound sources emit signals in a predictable manner. The idea that predictability can be exploited to support the segregation of one source's signal emissions from the overlapping signals of other sources has been expressed for a long time. Yet experimental evidence for a strong role of predictability within auditory scene analysis (ASA) has been scarce. Recently, there has been an upsurge in experimental and theoretical work on this topic resulting from fundamental changes in our perspective on how the brain extracts predictability from series of sensory events. Based on effortless predictive processing in the auditory system, it becomes more plausible that predictability would be available as a cue for sound source decomposition. In the present contribution, empirical evidence for such a role of predictability in ASA will be reviewed. It will be shown that predictability affects ASA both when it is present in the sound source of interest (perceptual foreground) and when it is present in other sound sources that the listener wishes to ignore (perceptual background). First evidence pointing toward age-related impairments in the latter capacity will be addressed. Moreover, it will be illustrated how effects of predictability can be shown by means of objective listening tests as well as by subjective report procedures, with the latter approach typically exploiting the multi-stable nature of auditory perception. Critical aspects of study design will be delineated to ensure that predictability effects can be unambiguously interpreted. Possible mechanisms for a functional role of predictability within ASA will be discussed, and an analogy with the old-plus-new heuristic for grouping simultaneous acoustic signals will be suggested.

Keywords: auditory stream segregation; bistable perception; integration; old-plus-new heuristic; predictive coding; sound processing.

Figures

Figure 1
Experimental paradigm confounding predictability of the “Integrated” and “Segregated” perceptual organizations. French-St. George and Bregman (1989) compared stimulus arrangements that were predictable or unpredictable with respect to stimulus timing and frequency. Sequences were predictable on both dimensions (A), on neither dimension (B), or on just one of the dimensions (not depicted). The cyclically repeating (thereby predictable) frequency patterns are marked in the upper panel. Stimulus timing is additionally marked by corresponding ticks on the X axis. The dashed line indicates the (nominal) separation between the “A” and “B” groups of tones. Participants were asked to try perceiving all tones as originating from one sound source and to press a button as long as they succeed in maintaining this percept. Predictability was assumed to increase the perceptual coherence of the tone set, thereby leading to a higher probability of perceiving the sequence as one stream (“Integrated”). Yet as illustrated in the upper panel, adding predictability to the sequence as a whole unavoidably renders the two separate streams more predictable, too. Thus their individual perceptual coherence might increase as well, leading to a higher probability of perceiving the sequence as “Segregated.” These opposite effects might cancel each other out, leading to a null effect on average that would not be indicative of a general absence of predictability effects in ASA.
Figure 2
Experimental paradigm disentangling predictability of the “Integrated” and “Segregated” perceptual organizations. The depicted stimulus arrangements are predictable (A) or unpredictable (B) with respect to stimulus frequency. The cyclically repeating (thereby predictable) frequency patterns are marked in the upper panel. The dashed line indicates the (nominal) separation between the “A” and “B” groups of tones. Critically, the number of elements included in the predictable patterns differs between the “A” and “B” group of tones. As a result, the length of the cyclically repeating overall pattern comprising “A” and “B” tones amounts to 24 elements, which is considerably too long to be picked up by the auditory system (e.g., Boh et al., 2011). Consequently, from a perceptual point of view the predictability manipulation is directional: It affects only predictability of the “Segregated” perceptual organization, whereas predictability of the “Integrated” organization remains unchanged. This directional manipulation allows for an unambiguous investigation of predictability effects in ASA (e.g., Bendixen et al., 2010, 2013a).
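The arithmetic behind the 24-element joint pattern can be sketched as follows. The caption does not state the individual pattern lengths or the exact tone ordering, so the values used here (a 3-element "A" pattern, a 4-element "B" pattern, strict A-B alternation) and the function names are illustrative assumptions chosen to reproduce a joint cycle of 24 tones; they are not taken from the cited studies.

```python
from math import gcd


def joint_cycle_length(len_a, len_b):
    """Tones needed before a strictly alternating A-B sequence repeats as a whole.

    Each A-B pair advances both patterns by one step, so the joint pattern
    repeats after lcm(len_a, len_b) pairs, i.e. twice that many tones.
    """
    lcm = len_a * len_b // gcd(len_a, len_b)
    return 2 * lcm


def build_sequence(pattern_a, pattern_b, n_tones):
    """Interleave two cyclic patterns as A, B, A, B, ... (assumed ordering)."""
    seq = []
    for i in range(n_tones):
        src = pattern_a if i % 2 == 0 else pattern_b
        seq.append(src[(i // 2) % len(src)])
    return seq


def smallest_period(seq):
    """Smallest shift p such that seq[i] == seq[i + p] for every valid i."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return len(seq)


# Hypothetical pattern lengths (3 and 4), not stated in the caption.
pattern_a = ["a1", "a2", "a3"]
pattern_b = ["b1", "b2", "b3", "b4"]
sequence = build_sequence(pattern_a, pattern_b, 96)

print(joint_cycle_length(len(pattern_a), len(pattern_b)))
print(smallest_period(sequence))
```

Under these assumptions each stream repeats after only 3 or 4 of its own tones, whereas the interleaved sequence repeats only after 24 tones, which is the asymmetry the directional manipulation relies on.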
Figure 3
Designs for studying predictability effects in auditory scene analysis. Six different levels of predictability are distinguished, and for each previous study it is indicated which levels were contrasted. Each level is schematically illustrated with a cutout of the corresponding stimulus sequence. Time is represented on the X axis and frequency on the Y axis of all panels. The schematic depiction uses the “ABA_” paradigm (Van Noorden, 1975), but studies have also used the “ABAB” paradigm or more irregular arrangements of “A” and “B” tones. Straight lines indicate the presence of predictive relations between successive tones (i.e., the feature values of one tone are predictive of the feature values of the next tone in one or both perceptual organizations). Dotted lines indicate random successions of feature values. The feature whose predictability was manipulated differs between studies (e.g., onset time, frequency, intensity, location). The effects on predictability of the “Integrated” (Int) and “Segregated” (Seg) organizations are marked with “++” (fully predictable), “+” (partially predictable), or “−” (unpredictable). The resulting predictability difference between the two organizations is marked in the “Diff” row. Following these differences, predictability conditions to the left should increase the likelihood of “Integrated” percepts, whereas predictability conditions to the right should increase the likelihood of “Segregated” percepts. Note that many studies have compared conditions in the middle of this scheme and have revealed no clear effects of predictability on auditory perceptual organization. Studies employing directional manipulations have tended to investigate conditions where the “Segregated” organization was more predictable than the “Integrated” organization (depicted on the right of this scheme). No study has so far investigated a condition in which the “Integrated” but not the “Segregated” organization was fully predictable.
Figure 4
Possible cue effects in the subjective-report procedure. Artificially simplified time-courses of perceptual switching were generated for illustration purposes. The upper row (A) reflects a balanced distribution of “Integrated” and “Segregated” percepts. The middle row (B) shows the impact of a percept-stabilizing cue in favor of stream segregation, which prolongs the duration (i.e., stability) of “Segregated” percepts but leaves the duration of “Integrated” percepts unaffected. The lower row (C) shows the impact of a percept-inducing cue in favor of segregation, which prolongs the duration of “Segregated” percepts and additionally shortens the duration of “Integrated” percepts by causing perceptual switches back to the “Segregated” percept. The dashed lines in each panel mark the proportion and duration values from the balanced condition for comparison. Note that in this example, percept-stabilizing and percept-inducing cues have identical effects on the proportions of the two percepts, hence analyzing only the average proportions cannot differentiate between these qualitatively different types of cues. The average phase durations are informative about the cause of the changes in proportion, and thus about the underlying mechanism of the cue.
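The distinction drawn in this caption can be illustrated with a minimal sketch that computes percept proportions and average phase durations from a reported time course. The phase durations below are invented for illustration only (they are not data from any study), and the function name `summarize` and the labels "Int" and "Seg" are assumptions of this sketch.

```python
def summarize(phases):
    """Summarize a perceptual time course.

    phases: list of (percept, duration_in_seconds) tuples in temporal order.
    Returns, per percept, its proportion of total time and its mean phase duration.
    """
    total = {}
    count = {}
    for percept, duration in phases:
        total[percept] = total.get(percept, 0.0) + duration
        count[percept] = count.get(percept, 0) + 1
    grand_total = sum(total.values())
    return {
        percept: {
            "proportion": total[percept] / grand_total,
            "mean_duration": total[percept] / count[percept],
        }
        for percept in total
    }


# Invented example time courses, each covering 90 s or less of listening.
balanced = [("Int", 10.0), ("Seg", 10.0)] * 3      # equal phase durations
stabilizing = [("Int", 10.0), ("Seg", 20.0)] * 3   # only "Seg" phases prolonged
inducing = [("Int", 7.5), ("Seg", 15.0)] * 4       # "Seg" longer AND "Int" shorter

for name, phases in [("balanced", balanced),
                     ("stabilizing", stabilizing),
                     ("inducing", inducing)]:
    print(name, summarize(phases))
```

In this invented example the stabilizing and inducing time courses yield the same proportion of "Segregated" percepts, and only the mean duration of "Integrated" phases separates them, which is why the caption recommends analyzing phase durations alongside proportions.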
