The processing of audio-visual speech: empirical and neural bases
- PMID: 17827105
- PMCID: PMC2606792
- DOI: 10.1098/rstb.2007.2155
Abstract
In this selective review, I outline a number of ways in which seeing the talker affects auditory perception of speech, including, but not confined to, the McGurk effect. To date, studies suggest that all linguistic levels are susceptible to visual influence, and that two main modes of processing can be described: a complementary mode, whereby vision provides information more efficiently than hearing for some under-specified parts of the speech stream, and a correlated mode, whereby vision partially duplicates information about dynamic articulatory patterning. Cortical correlates of seen speech suggest that, at the neurological as well as the perceptual level, auditory processing of speech is affected by vision, so that 'auditory speech regions' are activated by seen speech. The processing of natural speech, whether it is heard, seen, or heard and seen, activates the perisylvian language regions (left>right). Activation most probably proceeds in a specific order: first superior temporal, then inferior parietal and finally inferior frontal regions (left>right). There is some differentiation of the visual input stream to the core perisylvian language system, suggesting that complementary seen-speech information makes special use of the visual ventral processing stream, while for correlated visual speech the dorsal processing stream, which is sensitive to visual movement, may be relatively more involved.
Similar articles
- [Auditory perception and language: functional imaging of speech sensitive auditory cortex]. Rev Neurol (Paris). 2001 Sep;157(8-9 Pt 1):837-46. PMID: 11677406. Review. French.
- Cross-modal binding and activated attentional networks during audio-visual speech integration: a functional MRI study. Cereb Cortex. 2005 Nov;15(11):1750-60. doi: 10.1093/cercor/bhi052. PMID: 15716468. Clinical Trial.
- Reading speech from still and moving faces: the neural substrates of visible speech. J Cogn Neurosci. 2003 Jan 1;15(1):57-70. doi: 10.1162/089892903321107828. PMID: 12590843.
- Predicting "When" in Discourse Engages the Human Dorsal Auditory Stream: An fMRI Study Using Naturalistic Stories. J Neurosci. 2016 Nov 30;36(48):12180-12191. doi: 10.1523/JNEUROSCI.4100-15.2016. PMID: 27903727. Free PMC article.
- Neuronal basis of speech comprehension. Hear Res. 2014 Jan;307:121-35. doi: 10.1016/j.heares.2013.09.011. PMID: 24113115. Review.
Cited by
- Influences of selective adaptation on perception of audiovisual speech. J Phon. 2016 May;56:75-84. doi: 10.1016/j.wocn.2016.02.004. PMID: 27041781. Free PMC article.
- Sensitivity of occipito-temporal cortex, premotor and Broca's areas to visible speech gestures in a familiar language. PLoS One. 2020 Jun 19;15(6):e0234695. doi: 10.1371/journal.pone.0234695. PMID: 32559213. Free PMC article.
- Systematic literature review on audio-visual multimodal input in listening comprehension. Front Psychol. 2022 Sep 6;13:980133. doi: 10.3389/fpsyg.2022.980133. PMID: 36160551. Free PMC article.
- Brain responses and looking behavior during audiovisual speech integration in infants predict auditory speech comprehension in the second year of life. Front Psychol. 2013 Jul 16;4:432. doi: 10.3389/fpsyg.2013.00432. PMID: 23882240. Free PMC article.
- The natural statistics of audiovisual speech. PLoS Comput Biol. 2009 Jul;5(7):e1000436. doi: 10.1371/journal.pcbi.1000436. PMID: 19609344. Free PMC article.