Monkeys and humans share a common computation for face/voice integration
- PMID: 21998576
- PMCID: PMC3182859
- DOI: 10.1371/journal.pcbi.1002165
Abstract
Speech production involves the movement of the mouth and other regions of the face resulting in visual motion cues. These visual cues enhance intelligibility and detection of auditory speech. As such, face-to-face speech is fundamentally a multisensory phenomenon. If speech is fundamentally multisensory, it should be reflected in the evolution of vocal communication: similar behavioral effects should be observed in other primates. Old World monkeys share with humans vocal production biomechanics and communicate face-to-face with vocalizations. It is unknown, however, if they, too, combine faces and voices to enhance their perception of vocalizations. We show that they do: monkeys combine faces and voices in noisy environments to enhance their detection of vocalizations. Their behavior parallels that of humans performing an identical task. We explored what common computational mechanism(s) could explain the pattern of results we observed across species. Standard explanations or models such as the principle of inverse effectiveness and a "race" model failed to account for their behavior patterns. Conversely, a "superposition model", positing the linear summation of activity patterns in response to visual and auditory components of vocalizations, served as a straightforward but powerful explanatory mechanism for the observed behaviors in both species. As such, it represents a putative homologous mechanism for integrating faces and voices across primates.
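The model comparison in the abstract can be illustrated with a toy reaction-time simulation (all distributions and parameters below are hypothetical, chosen only for illustration, not taken from the paper's data). A race model treats the auditory and visual channels as independent detectors, with the faster one triggering the response; its predictions are bounded by Miller's race-model inequality. A superposition model instead sums the two activity streams linearly, so evidence accumulates at the combined rate and can outrun any race:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical unisensory detection latencies in ms (illustrative
# shifted-exponential distributions, not the paper's data).
rt_a = 200 + rng.exponential(80, n)   # auditory-alone reaction times
rt_v = 220 + rng.exponential(90, n)   # visual-alone reaction times

# Race model: the two modalities are processed independently and the
# faster one triggers detection on each trial.
rt_race = np.minimum(rt_a, rt_v)

# Superposition model: auditory and visual activity sum linearly, so
# evidence accumulates at the combined rate. For exponential latencies,
# summing the rates gives scale 1 / (1/80 + 1/90).
rt_super = 200 + rng.exponential(1 / (1 / 80 + 1 / 90), n)

# Miller's race-model inequality: any race predicts
#   P(RT_av <= t) <= P(RT_a <= t) + P(RT_v <= t)  for every t.
t = 260.0
race_cdf = np.mean(rt_race <= t)
bound = np.mean(rt_a <= t) + np.mean(rt_v <= t)
print(race_cdf <= bound)                  # the race obeys the bound
print(rt_super.mean() < rt_race.mean())   # summation beats the race on average
```

The diagnostic used in this literature is exactly this asymmetry: observed multisensory speed-ups that exceed what any race between independent channels could produce point toward linear summation of the two response patterns.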
Conflict of interest statement
The authors have declared that no competing interests exist.