J Neurosci. 2009 Oct 28;29(43):13445-53. doi: 10.1523/JNEUROSCI.3194-09.2009.

Dual neural routing of visual facilitation in speech processing

Luc H Arnal et al. J Neurosci. 2009.

Abstract

Viewing our interlocutor facilitates speech perception, unlike, for instance, when we speak on the telephone. Several neural routes and mechanisms could account for this phenomenon. Using magnetoencephalography, we show that when the interlocutor is visible, the latency of auditory responses (M100) shortens in proportion to how predictable the speech is from the visual input, whether or not the auditory signal is congruent with it. Incongruence of auditory and visual input affected auditory responses approximately 20 ms after the latency shortening was detected, indicating that an initial content-dependent auditory facilitation by vision is followed by a feedback signal reflecting the error between expected and received auditory input (prediction error). We then used functional magnetic resonance imaging and confirmed that distinct routes of visual information to auditory processing underlie these two functional mechanisms. Functional connectivity between visual motion and auditory areas depended on the degree of visual predictability, whereas connectivity between the superior temporal sulcus and both auditory and visual motion areas was driven by audiovisual (AV) incongruence. These results establish two distinct mechanisms by which the brain uses potentially predictive visual information to improve auditory perception: a fast, direct corticocortical pathway conveys visual motion parameters to auditory cortex, and a slower, indirect feedback pathway signals the error between the visual prediction and the auditory input.


Figures

Figure 1.
Neuroanatomical model of auditory facilitation by concurrent visual input and related predictions. A, Two anatomical pathways are proposed for the routing of visual information (green arrows) to auditory areas (A) (red arrows represent the routing of auditory information): pathway (1) is a direct corticocortical pathway from visual cortices (V), and pathway (2) is a feedback pathway from the multisensory STS. B, Time course of evoked components for auditory (red), visual (green), and AV (blue) stimuli. Neuronal facilitation is assessed by measuring the amplitude reduction and latency shortening of the M100A peak in the AV–V versus A conditions. C, Predictions about the origin of M100A facilitation as a function of (1) viseme dependency and (2) mismatch responses when auditory and visual syllables are incongruent.
Figure 2.
Predictability of syllables /ga/, /ta/, /la/, /pa/, and /ja/, presented visually (V) and ordered by increasing predictability. The predictive power of five visual syllables was assessed by measuring recognition rates in 15 subjects. Error bars indicate SEM. **p < 0.01, ***p < 0.001.
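For illustration, the predictability measure used here (per-viseme recognition rate, with SEM error bars across the 15 subjects) can be computed with a short Python sketch; the subject-by-syllable proportions below are invented, and only the 15 x 5 layout comes from the legend:

import numpy as np

# Illustrative data: proportion of visual-only trials correctly recognized,
# 15 subjects (rows) x 5 visemes (columns): /ga/, /ta/, /la/, /pa/, /ja/.
rng = np.random.default_rng(0)
recognition = rng.uniform(0.1, 0.9, size=(15, 5))

rates = recognition.mean(axis=0)                     # mean recognition rate per viseme
sem = recognition.std(axis=0, ddof=1) / np.sqrt(15)  # SEM (the error bars)

labels = ["ga", "ta", "la", "pa", "ja"]
for i in np.argsort(rates):                          # order by increasing predictability
    print(f"/{labels[i]}/: rate = {rates[i]:.2f} +/- {sem[i]:.2f}")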
Figure 3.
Facilitation of the early auditory response by visual input. A, Auditory evoked response (M100A) latency (A, dark bar) was globally reduced by visual syllables, whether they matched the sound (AVc, gray bar) or not (AVi, white bar). B, M100A latency reduction [A–(AV–V)], plotted as a function of visual predictability (Fig. 2), shows a significant viseme dependency but no effect of incongruence. The M100 latency reduction is proportional to visual predictability in both AVc (black dashed line) and AVi (gray dashed line) combinations; no significant difference between the AVc and AVi regression slopes was found. C, M100A amplitude change (positive values correspond to a reduction of the M100A in the AV–V vs A condition) indicates a significant effect of syllables but no viseme dependency or incongruence effect. D, Perceived incongruence for AVc and AVi combinations. Note that comparisons focus on the visual syllable (for example, PaAVc is compared with PaAVi, such as PaV/GaA) (supplemental Fig. 1B, available at www.jneurosci.org as supplemental material). Perceived incongruence for AVi pairs correlates positively with visual predictability (gray dashed line), whereas perceived incongruence for AVc pairs correlates negatively with visual predictability (black dashed line; interaction significant). Error bars indicate SEM. *p < 0.05, **p < 0.01, ***p < 0.001.
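The facilitation metric in B lends itself to a compact worked example. This Python sketch (all latencies and predictability values are hypothetical placeholders, not the paper's data) computes the latency reduction A–(AV–V) per syllable and regresses it on visual predictability, mirroring the dashed regression lines:

import numpy as np
from scipy.stats import linregress

# Hypothetical M100 peak latencies (ms): auditory-alone (A) and the
# audiovisual-minus-visual difference response (AV - V), one value per syllable.
lat_A = np.array([102.0, 101.0, 103.0, 100.0, 102.0])
lat_AV_minus_V = np.array([100.0, 97.0, 96.0, 92.0, 90.0])

# Visual predictability: viseme recognition rates (cf. Fig. 2), placeholders.
predictability = np.array([0.2, 0.4, 0.5, 0.7, 0.9])

# Latency reduction A - (AV - V): larger values = stronger speeding by vision.
facilitation = lat_A - lat_AV_minus_V

# Panel B's claim: facilitation grows linearly with visual predictability.
fit = linregress(predictability, facilitation)
print(f"slope = {fit.slope:.1f} ms per unit predictability, "
      f"r = {fit.rvalue:.2f}, p = {fit.pvalue:.3f}")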
Figure 4.
Effect of incongruence on ERFs across time. A, Scalp topographies within the four time windows in which a neural incongruence effect was detected (paired t test; grand average of AVi vs AVc conditions, with the overall set of stimuli in the AVi and AVc conditions physically identical) (supplemental Fig. 1A, available at www.jneurosci.org as supplemental material). B, Effect of incongruence on the viseme dependency of neural response amplitude, tested within the two extreme time windows across the five selected sensors (black dots on topographies) showing a maximal effect. Dark and light gray dashed lines represent the correlations between amplitude and predictability in the AVc and AVi conditions, respectively. C, Parallel between neural responses and behavioral reports related to incongruence. The left axis (dark line) indicates incongruence-by-viseme interaction F values (significant for the last two time windows) at the ERF level. The gray line shows that the correlation values (Pearson's r, right axis) between ERF amplitude differences and perceived-incongruence differences for each AVi versus AVc pair also increase over time. Error bars indicate SEM. n.s., Nonsignificant effect. *p < 0.05, ***p < 0.001.
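Panel C pairs a neural measure with a behavioral one in each time window. A minimal sketch of that correlation step, with invented per-pair difference values standing in for the measured ones:

import numpy as np
from scipy.stats import pearsonr

# Hypothetical AVi - AVc differences for the five syllable pairs in one time window.
erf_amp_diff = np.array([0.5, 1.2, 1.8, 2.6, 3.1])    # ERF amplitude difference (a.u.)
perceived_diff = np.array([0.3, 1.0, 1.5, 2.2, 3.0])  # perceived-incongruence difference

r, p = pearsonr(erf_amp_diff, perceived_diff)
print(f"r = {r:.2f}, p = {p:.3f}")
# Fig. 4C reports that r grows across successive time windows: the later the
# neural incongruence effect, the more closely it tracks perceived incongruence.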
Figure 5.
Surface renderings of MEG sources and fMRI activations. A, Source reconstruction of the M170 peak measured in response to the viseme /pa/ shows that early activity related to lip movements emerges in the temporo-occipital cortex (visual motion cortex, as separately assessed with a functional localizer). B, Summary of fMRI findings: the parametric increase with syllable visual predictability (green blob) overlaps with the sources of the M170 shown in A. Functional connectivity was assessed using PPI, with visual syllable recognition rates as the psychological variable. There was a parametric increase of functional connectivity between visual motion cortex and auditory regions surrounding Heschl's gyrus (red blobs). The middle STS showed the opposite effect, i.e., a decrease of functional connectivity as a function of visual predictability (yellow and blue blobs), when using both visual motion and auditory cortices as seed regions. C, Activity in the STS also reflects the amount of prediction error, showing a signal increase for incongruent stimuli (white squares) and a signal decrease for congruent stimuli (gray squares), in proportion to visual predictability. *p < 0.05, **p < 0.01.
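The PPI analysis summarized in B can be sketched schematically. The toy Python version below shows only the core idea, a regressor formed as the product of a seed time course and a psychological variable; it omits the HRF deconvolution and preprocessing that a real fMRI PPI requires, and all signals are simulated:

import numpy as np

rng = np.random.default_rng(1)
n_scans = 200

# Simulated BOLD time course of the seed region (e.g., visual motion cortex).
seed = rng.standard_normal(n_scans)

# Psychological variable: visual predictability (viseme recognition rate)
# of the syllable presented at each scan.
psych = rng.uniform(0.1, 0.9, n_scans)

# PPI regressor: element-wise product of the centered seed and psychological terms.
ppi = (seed - seed.mean()) * (psych - psych.mean())

# Toy target voxel in which seed coupling strengthens with predictability.
target = 0.5 * seed + 0.8 * ppi + rng.standard_normal(n_scans)

# GLM: target ~ seed + psych + ppi + intercept. The PPI beta indexes how much
# seed-target coupling changes as a function of visual predictability.
X = np.column_stack([seed, psych, ppi, np.ones(n_scans)])
betas, *_ = np.linalg.lstsq(X, target, rcond=None)
print(f"PPI beta = {betas[2]:.2f} (positive: coupling increases with predictability)")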
