The integration of continuous audio and visual speech in a cocktail-party environment depends on attention
- PMID: 37121375
- PMCID: PMC12956257
- DOI: 10.1016/j.neuroimage.2023.120143
Abstract
In noisy environments, our ability to understand speech benefits greatly from seeing the speaker's face. This benefit is attributed to the brain's ability to integrate audio and visual information, a process known as multisensory integration. In addition, selective attention plays an enormous role in what we understand, the so-called cocktail-party phenomenon. But how attention and multisensory integration interact remains incompletely understood, particularly in the case of natural, continuous speech. Here, we addressed this issue by analyzing EEG data recorded from participants who undertook a multisensory cocktail-party task using natural speech. To assess multisensory integration, we modeled the EEG responses to the speech in two ways. The first assumed that audiovisual speech processing is simply a linear combination of audio speech processing and visual speech processing (i.e., an A + V model), while the second allowed for the possibility of audiovisual interactions (i.e., an AV model). Applying these models to the data revealed that EEG responses to attended audiovisual speech were better explained by the AV model, providing evidence for multisensory integration. In contrast, responses to unattended audiovisual speech were best captured by the A + V model, suggesting that multisensory integration is suppressed for unattended speech. Follow-up analyses revealed limited evidence for early multisensory integration of unattended AV speech, with no integration at later stages of processing. We take these findings as evidence that the integration of natural audio and visual speech occurs at multiple levels of processing in the brain, each of which can be differentially affected by attention.
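The A + V versus AV comparison can be sketched with forward encoding models (temporal response functions) of the kind commonly used for continuous speech. The Python example below is a minimal, self-contained illustration, not the authors' pipeline: the simulated signals, data shapes, lag count, and ridge regularization value are all assumptions, and in the actual study the unisensory models would typically be estimated from separate audio-only and visual-only conditions before their predictions are summed.

```python
"""Minimal sketch of the additive (A + V) vs. interactive (AV) encoding-model
comparison. All signals here are simulated stand-ins for real stimulus
features (e.g., the audio envelope and a visual motion signal) and EEG."""
import numpy as np
from numpy.linalg import solve

def lagged_design(x, n_lags):
    """Stack time-lagged copies of stimulus features; x is (n_samples, n_features)."""
    n, f = x.shape
    X = np.zeros((n, f * n_lags))
    for k in range(n_lags):
        X[k:, k * f:(k + 1) * f] = x[:n - k]
    return X

def fit_trf(X, y, lam=1e3):
    """Ridge-regularized linear encoding model (a temporal response function)."""
    return solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Hypothetical inputs sampled at a common rate over the same time window.
rng = np.random.default_rng(0)
n, n_lags = 5000, 32
audio = rng.standard_normal((n, 1))   # e.g., acoustic envelope
visual = rng.standard_normal((n, 1))  # e.g., lip-aperture / motion signal
eeg = 0.5 * audio[:, 0] + 0.3 * visual[:, 0] + rng.standard_normal(n)

Xa, Xv = lagged_design(audio, n_lags), lagged_design(visual, n_lags)

# A + V model: unisensory TRFs fit separately, their predictions summed.
# (In the real paradigm these would come from audio-only / visual-only trials.)
pred_additive = Xa @ fit_trf(Xa, eeg) + Xv @ fit_trf(Xv, eeg)

# AV model: one TRF fit jointly on the audiovisual data, so any cross-modal
# interaction can be absorbed into the weights.
Xav = np.hstack([Xa, Xv])
pred_av = Xav @ fit_trf(Xav, eeg)

# With purely additive simulated data the two models should predict about
# equally well; a reliable AV advantage is the signature of integration.
for name, pred in [("A + V", pred_additive), ("AV", pred_av)]:
    print(f"{name} model: r = {np.corrcoef(pred, eeg)[0, 1]:.3f}")
```

Under this logic, the result reported in the abstract corresponds to the AV model outperforming the summed A + V predictions (in cross-validated prediction accuracy) for attended speech but not for unattended speech.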
Keywords: Cocktail party; Hierarchical processing; Multisensory integration; Speech.
Copyright © 2023. Published by Elsevier Inc.
Conflict of interest statement
Declaration of Competing Interest The authors declare that they have no competing interests, financial or otherwise.
