Front Hum Neurosci. 2021 Apr 1:15:616049.
doi: 10.3389/fnhum.2021.616049. eCollection 2021.

Rethinking the Mechanisms Underlying the McGurk Illusion

Mariel G Gonzales et al. Front Hum Neurosci. 2021.

Abstract

The McGurk illusion occurs when listeners hear an illusory percept (e.g., "da") resulting from mismatched pairings of audiovisual (AV) speech stimuli (e.g., auditory /ba/ paired with visual /ga/). Hearing a third percept, distinct from both the auditory and visual input, has been used as evidence of AV fusion. We examined whether the McGurk illusion is instead driven by visual dominance, whereby the third percept, e.g., "da," represents a default percept for visemes with an ambiguous place of articulation (POA), like /ga/. Participants watched videos of a talker uttering various consonant vowels (CVs) with (AV) and without (V-only) audio of /ba/. Individuals transcribed the CV they saw (V-only) or heard (AV). In the V-only condition, individuals predominantly saw "da"/"ta" when viewing CVs with indiscernible POAs. Likewise, in the AV condition, upon perceiving an illusion, they predominantly heard "da"/"ta" for CVs with indiscernible POAs. The illusion was stronger in individuals who exhibited weak /ba/ auditory encoding (examined using a control auditory-only task). In Experiment 2, we attempted to replicate these findings using stimuli recorded from a different talker. The V-only results were not replicated, but again individuals predominantly heard "da"/"ta"/"tha" as an illusory percept for various AV combinations, and the illusion was stronger in individuals who exhibited weak /ba/ auditory encoding. These results demonstrate that when visual CVs with indiscernible POAs are paired with a weakly encoded auditory /ba/, listeners default to hearing "da"/"ta"/"tha", thus tempering the AV fusion account and favoring a default mechanism triggered when both AV stimuli are ambiguous.

Keywords: McGurk illusion; audiovisual fusion; cross-modal phonetic encoding; multisensory integration; phonemic representations.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Gray-scale matrix depicting percentages of response percepts to visual-only (V-only) CV utterances for Experiment 1 (A) and Experiment 2 (B). The percentage for each percept was calculated as the percentage of responses of that percept relative to all other responses within a stimulus type (i.e., for each viseme). (C) For the visual-only (V-only) CV utterances in Experiment 1 (left) and Experiment 2 (right), the percentage of responses for the “da”/“ta” or “ga”/“ka” percept relative to all other responses across all visemes, except /ba/. The boxplots indicate the median (the horizontal line inside the box) and the 25th and 75th percentiles (the box’s bottom and top edges, respectively); the whiskers indicate the range of the individual data points. (Note that there were no outliers.)
FIGURE 2
Gray-scale matrix depicting percentages of percepts in response to congruent /ba/ AV utterances and incongruent AV utterances of various visual CVs combined with audio /ba/ for Experiment 1 (A) and Experiment 2 (B). The percentage for each percept was calculated as the percentage of responses of the percept relative to all other responses within a stimulus type.
FIGURE 3
Plots depicting Experiment 1 correlations between percentages of the “ba”/“pa” percept of the A-only /ba–da/ stimuli (superimposed /ba/ and /da/ CVs) and the overall response percentages of the “da”/“ta”/“tha” percept (across all stimuli except /ba/) for the V-only condition (A), the AV condition (B), and the mean of the V-only and AV conditions (C).
FIGURE 4
Gray-scale matrix depicting percentages of percepts in response to A-only superimposed stimuli for Experiment 1 (A) and Experiment 2 (B).
FIGURE 5
A plot depicting the correlation between percentages of “da”/“ta”/“tha” responses to the A-only /ba–ba/ stimuli (superimposed /ba/ and /ba/ CVs) and the overall response percentages of the “da”/“ta”/“tha” percept during the AV condition. These data are from Experiment 1.
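
The figure captions describe two computations: the percentage of each percept relative to all responses within a stimulus type, and a correlation between per-participant percentages. The Python sketch below illustrates both under stated assumptions. The function names, the `(stimulus, percept)` trial format, and the sample data are hypothetical and are not taken from the study. A Pearson coefficient is assumed here because the abstract does not name the correlation statistic used.

```python
from collections import Counter, defaultdict
from math import sqrt


def percept_percentages(trials):
    """Return {stimulus: {percept: percent of that stimulus's responses}}.

    `trials` is an iterable of (stimulus, percept) pairs (hypothetical format).
    """
    counts = defaultdict(Counter)
    for stimulus, percept in trials:
        counts[stimulus][percept] += 1
    return {
        stimulus: {
            percept: 100.0 * n / sum(percepts.values())
            for percept, n in percepts.items()
        }
        for stimulus, percepts in counts.items()
    }


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed statistic)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Made-up responses for illustration only; not data from the paper.
trials = [
    ("ga", "da"), ("ga", "da"), ("ga", "ga"), ("ga", "ta"),
    ("ba", "ba"), ("ba", "ba"),
]
matrix = percept_percentages(trials)
```

Each row of `matrix` sums to 100, matching the within-stimulus normalization described in the captions. For the correlation plots, `pearson_r` would take one percentage per participant on each axis.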
