PLoS One. 2019 Jul 16;14(7):e0219744. doi: 10.1371/journal.pone.0219744. eCollection 2019.

Speech-specific audiovisual integration modulates induced theta-band oscillations


Alma Lindborg et al.

Abstract

Speech perception is influenced by vision through a process of audiovisual integration. This is demonstrated by the McGurk illusion, where visual speech (for example /ga/) dubbed with incongruent auditory speech (such as /ba/) leads to a modified auditory percept (/da/). Recent studies have indicated that perception of the incongruent speech stimuli used in McGurk paradigms involves mechanisms of both general and audiovisual-speech-specific mismatch processing, and that general mismatch processing modulates induced theta-band (4-8 Hz) oscillations. Here, we investigated whether the theta modulation merely reflects mismatch processing or, alternatively, audiovisual integration of speech. We used electroencephalographic recordings from two previously published studies of audiovisual sine-wave speech (SWS), a spectrally degraded speech signal that sounds nonsensical to naïve perceivers but is perceived as speech by informed subjects. Earlier studies have shown that informed, but not naïve, subjects integrate SWS phonetically with visual speech. In an N1/P2 event-related potential paradigm, we found a significant difference in theta-band activity between informed and naïve perceivers of audiovisual speech, suggesting that audiovisual integration modulates induced theta-band oscillations. In a McGurk mismatch negativity (MMN) paradigm, where infrequent McGurk stimuli were embedded in a sequence of frequent audiovisually congruent stimuli, we found no difference between congruent and McGurk stimuli. The infrequent stimuli in this paradigm violate both the general prediction of stimulus content and that of audiovisual congruence. Hence, we found no support for the hypothesis that audiovisual mismatch modulates induced theta-band oscillations. Nor did we find any effects of audiovisual integration in the MMN paradigm, possibly due to the experimental design.
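The abstract's key measure, induced theta-band power, is the non-phase-locked part of the oscillatory response: the trial-averaged evoked response is subtracted from each trial before power is computed. The sketch below illustrates this general approach on synthetic single-sensor epochs; the sampling rate, filter order, band edges, and data are illustrative assumptions, not the study's actual pipeline (which used time-frequency decomposition and cluster-based statistics).

```python
# Hedged sketch: induced (non-phase-locked) theta-band power from epoched EEG.
# All parameters here are illustrative assumptions, not taken from the study.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def induced_theta_power(epochs, fs, band=(4.0, 8.0)):
    """epochs: (n_trials, n_samples) array for one sensor.
    Returns per-sample induced power: the trial average of the squared
    band-limited amplitude envelope after removing the evoked response."""
    evoked = epochs.mean(axis=0)          # phase-locked (evoked) component
    induced = epochs - evoked             # remove it from every trial
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, induced, axis=1)
    envelope = np.abs(hilbert(filtered, axis=1))
    return (envelope ** 2).mean(axis=0)

# Usage: 50 synthetic trials with a 6 Hz burst whose phase varies trial to
# trial, so its power is induced rather than evoked.
rng = np.random.default_rng(0)
fs = 250
t = np.arange(0, 1.0, 1 / fs)
trials = np.stack([
    np.sin(2 * np.pi * 6 * t + rng.uniform(0, 2 * np.pi)) * (t > 0.3) * (t < 0.7)
    + 0.1 * rng.standard_normal(t.size)
    for _ in range(50)
])
power = induced_theta_power(trials, fs)
# Induced theta power should be larger inside the burst window than before it.
print(power[(t > 0.35) & (t < 0.65)].mean() > power[t < 0.25].mean())
```

Because the burst's phase is randomized across trials, it survives the evoked-response subtraction, which is exactly the property that distinguishes induced from evoked activity.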


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1
Fig 1. Time evolution of the negative cluster (p = 0.0200) in the 4–8 Hz band for the congruent AV condition.
Sensors belonging to the cluster are shown in bold.
Fig 2
Fig 2. Grand average power by time (x-axis) and frequency (y-axis) for speech mode (upper row) and non-speech mode (lower row) groups in the N1/P2 dataset, at sensor level.
In the non-speech mode group, enhanced theta-band activity is observed from around 100 ms to 400 ms. This effect is largely absent in the speech mode group for the audiovisual conditions, with the largest between-group difference for Audiovisual Congruent trials.
Fig 3
Fig 3. Topographic distribution of grand average 4–8 Hz power for speech mode (top) and non-speech mode (bottom) at 0–300 ms.
Fig 4
Fig 4. Mean power over the SM < NSM cluster found for the pooled audiovisual conditions.
Whiskers represent the standard error of the mean over participants.
Fig 5
Fig 5. Grand average power at a central sensor for the MMN dataset, by group and condition.
For the Speech mode group, there are no clear differences between the conditions, contrary to the mismatch hypothesis. For the Non-speech mode group, there seems to be a deviant < standard difference in the alpha and upper theta band, which cannot be explained by any of our hypotheses.
Fig 6
Fig 6. Topographic distribution of grand average 4–8 Hz power for speech mode (left) and non-speech mode (right) at 200–500 ms.

