Curr Biol. 2013 Aug 19;23(16):1585-9.
doi: 10.1016/j.cub.2013.06.042. Epub 2013 Jul 25.

Lexical influences on auditory streaming

Alexander J Billig et al. Curr Biol.

Abstract

Biologically salient sounds, including speech, are rarely heard in isolation. Our brains must therefore organize the input arising from multiple sources into separate "streams" and, in the case of speech, map the acoustic components of the target signal onto meaning. These auditory and linguistic processes have traditionally been considered to occur sequentially and are typically studied independently [1, 2]. However, evidence that streaming is modified or reset by attention [3], and that lexical knowledge can affect reports of speech sound identity [4, 5], suggests that higher-level factors may influence perceptual organization. In two experiments, listeners heard sequences of repeated words or acoustically matched nonwords. After several presentations, they reported that the initial /s/ sound in each syllable formed a separate stream; the percept then fluctuated between the streamed and fused states in a bistable manner. In addition to measuring these verbal transformations, we assessed streaming objectively by requiring listeners to detect occasional targets: syllables containing a gap after the initial /s/. Performance was better when streaming caused the syllables preceding the target to transform from words into nonwords, rather than from nonwords into words. Our results show that auditory stream formation is influenced not only by the acoustic properties of speech sounds, but also by higher-level processes involved in recognizing familiar words.
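
The gap-detection performance described above is reported in the figure captions as sensitivity (d′). As a minimal sketch of how such a value is commonly computed, assuming the standard signal-detection definition d′ = z(hit rate) − z(false-alarm rate): the function name, the example counts, and the 0.5 (log-linear) correction for extreme rates are illustrative assumptions, not details taken from the paper.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count) keeps the
    rates away from exactly 0 or 1, where the z-transform is undefined.
    The paper may have handled extreme rates differently.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one listener in one condition.
print(d_prime(hits=18, misses=2, false_alarms=2, correct_rejections=18))
```

A listener who responds identically on target and non-target trials gets d′ = 0; larger values indicate better discrimination of gap targets independent of response bias.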

Figures

Figure 1
Experiment One Procedure. (A) Schematic illustration of the time course of the start of one trial, in which the syllable is initially heard (and reported) as fused (“stone”) and then splits into two streams (“s” plus “dohne”). In this example, only the first of the two presented targets is detected. (B) Response interface in a trial where the syllable “stone” was presented. See also Figure S1.
Figure 2
Experiment One Temporal Dynamics. (A) Durations (on a log scale) of the first seven response phases. Data are drawn from the 12 participants who showed at least seven phases per trial. Error bars show the SEM, with across-participant differences removed (suitable for repeated-measures comparisons). (B) Distribution of response phase durations (excluding the first and last phases), based on data from all participants for all control trials (i.e., without targets) containing more than one phase. Arrows indicate peaks in the distribution at multiples of the stimulus onset asynchrony (672.5 ms).
Figure 3
Experiment One Perceptual Report and Gap Detection Results. Error bars show the SEM, with across-participant differences removed. (A) Proportion of time spent reporting the fused percept for stimuli where that form corresponded to a word or nonword. (B) Gap-detection sensitivity (d′) while reporting fused and streamed percepts. (C) Gap-detection sensitivity (d′) for syllables where the fused percept corresponded to a word or nonword, for each of the three gap sizes. (D) Between-participant correlation of the effects of lexicality on the time spent reporting the fused percept and on gap-detection sensitivity (averaged across gap sizes).
Figure 4
Experiment Two Design and Results. (A) Eight example stimulus sequences, which factorially vary the lexical status of the precursor syllables and penultimate syllable, and the presence or absence of targets. The numerical subscripts indicate which of two different recordings of each syllable was presented, and targets are outlined with dotted lines. Note that an acoustic change always occurs at the penultimate syllable of a sequence, regardless of whether the lexical status is also different, or whether a target is present. (B) Gap-detection sensitivity (d′) for sequences in which the fused percept of the precursor syllables (left pair of bars) or penultimate syllable (right pair of bars) corresponded to a word or nonword. Error bars show the SEM, with across-participant differences removed. See also Figure S2.
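
Several captions above report the SEM "with across-participant differences removed". The captions do not say which procedure was used; the sketch below shows one common approach (subtract each participant's own mean, add back the grand mean, then compute the SEM per condition), offered here only as an assumption. The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def within_participant_sem(data):
    """Per-condition SEM after removing across-participant differences.

    data: list of rows, one per participant; each row holds that
    participant's score in every condition (same order for everyone).
    Assumption: participant-mean centering, one common normalization;
    the paper's exact procedure is not stated in the captions.
    """
    n_participants = len(data)
    n_conditions = len(data[0])
    grand_mean = mean([x for row in data for x in row])
    # Center each participant on the grand mean so that overall
    # differences between participants no longer inflate the error bars.
    normalized = [[x - mean(row) + grand_mean for x in row] for row in data]
    return [
        stdev([row[c] for row in normalized]) / sqrt(n_participants)
        for c in range(n_conditions)
    ]


# Hypothetical scores: 3 participants x 2 conditions.
scores = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(within_participant_sem(scores))
```

In this toy data every participant shows the same condition difference, so the normalized error bars collapse to zero even though the participants differ widely in overall level. That is why such error bars suit repeated-measures comparisons.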

References

    1. Bregman A.S. Auditory Scene Analysis: The Perceptual Organization of Sound. M.I.T. Press; Cambridge: 1990.
    2. Pisoni D.B., Remez R.E., editors. The Handbook of Speech Perception. Blackwell; Oxford: 2005.
    3. Carlyon R.P., Cusack R., Foxton J.M., Robertson I.H. Effects of attention and unilateral neglect on auditory stream segregation. J. Exp. Psychol. Hum. Percept. Perform. 2001;27:115–127.
    4. Elman J.L., McClelland J.L. Cognitive penetration of the mechanisms of perception: Compensation for coarticulation of lexically restored phonemes. J. Mem. Lang. 1988;27:143–165.
    5. Shoaf L.C., Pitt M.A. Does node stability underlie the verbal transformation effect? A test of node structure theory. Percept. Psychophys. 2002;64:795–803.