Review
Seeing Perceiving. 2011;24(6):513-39. doi: 10.1163/187847611X595864. Epub 2011 Sep 29.

Some behavioral and neurobiological constraints on theories of audiovisual speech integration: a review and suggestions for new directions


Nicholas Altieri et al. Seeing Perceiving. 2011.

Abstract

Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from the visual and auditory modalities. Recent converging evidence from several disciplines reveals that the general framework of Summerfield's feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model in which auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and in which auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and should utilize dynamic modeling tools to further understand the timing and information-processing mechanisms involved in audiovisual speech integration.


Figures

Figure 1
(a) This portion of the figure illustrates modality-specific theories of integration (accounts (1) and (2)). Auditory and visual information undergo early sensory processing before translation into modality-specific features relevant to spoken language. Depending on the account, modality-specific information can be translated directly into phonemes or visemes, or alternatively, translated first into spectral (or featural) information before undergoing translation into higher-order, more abstract units such as discrete, time-invariant phonological units. (b) Gestural theories of integration (accounts (3) and (4)) are illustrated here. After early sensory encoding, relevant auditory and visual information is translated into gestural/motor codes and subsequently mapped onto hypothetical vocal tract configurations, or alternatively, directly perceived as dynamic events. These theories do not rule out the extraction of phonological information, although they assume that such extraction would be post-perceptual or irrelevant for many aspects of real-time language processing.


