Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention

Yuanqing Li et al. Sci Rep. 2016 Jan 13;6:18914. doi: 10.1038/srep18914.

Abstract

An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamic facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distributed set of brain areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features.


Figures

Figure 1. Experimental stimuli and time courses.
(A) Four examples of audiovisual stimuli; the red numbers indicate runs with the number task only. (B) Time course of a trial for the runs with the number task, in which the stimuli included randomly presented numbers and videos/audio clips/movie clips. (C) Time course of a trial for the runs with the gender, emotion, or bi-feature task. In both (B) and (C), each stimulus presentation (video/audio clip/movie clip) lasted 1,400 ms and was repeated four times during the first eight seconds of a trial. A visual cue (“+”) appeared at the 8th second and persisted for six seconds.
Figure 2. Brain areas for audiovisual sensory integration that met the criterion [AV > max(A, V) (p < 0.05, FWE-corrected)] ∩ [V > 0 or A > 0 (p < 0.05, uncorrected)].
(A) No brain areas exhibited audiovisual sensory integration for the number task. (B) Brain areas exhibiting audiovisual sensory integration for the gender task, including the left pSTS/MTG (Talairach coordinates of the cluster center: (−57, −34, −5); cluster size: 76). (C) Brain areas exhibiting audiovisual sensory integration for the emotion task, including the left pSTS/MTG (cluster center: (−60, −40, 1); cluster size: 98) and the right pSTS/MTG (cluster center: (45, −34, 19); cluster size: 13). (D) Brain areas exhibiting audiovisual sensory integration for the bi-feature task, including the left pSTS/MTG (cluster center: (−54, −46, 4); cluster size: 105) and the right pSTS/MTG (cluster center: (61, −45, −7); cluster size: 13). (E–H) Percent signal changes evoked by the audiovisual, visual-only, and auditory-only stimuli in the bilateral pSTS/MTG activation clusters shown in (A–D), respectively (the percent signal changes in (E) were calculated using the union of all activated voxels shown in (B–D)).
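The criterion above combines two voxelwise tests with a logical AND: a super-maximum test, AV > max(A, V), at a corrected threshold, and a positive unisensory response test, V > 0 or A > 0, at an uncorrected threshold. Below is a minimal NumPy sketch of such a conjunction, assuming the two voxelwise p-maps have already been computed with standard fMRI software; the array names are hypothetical, not taken from the paper's pipeline.

import numpy as np

def integration_mask(p_av_gt_max, p_uni_pos, alpha=0.05):
    # p_av_gt_max: voxelwise p-values for the contrast AV > max(A, V)
    #              (FWE-corrected in the paper).
    # p_uni_pos:   voxelwise p-values for the test V > 0 or A > 0
    #              (uncorrected in the paper).
    # Returns a boolean array marking voxels that pass both tests.
    return (p_av_gt_max < alpha) & (p_uni_pos < alpha)

# Example with random placeholder maps; real inputs would be whole-brain
# statistical volumes from a GLM analysis.
rng = np.random.default_rng(0)
mask = integration_mask(rng.uniform(size=(4, 4, 4)), rng.uniform(size=(4, 4, 4)))
print(mask.sum(), "voxels meet the criterion")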
Figure 3. Reproducibility ratios (means and standard errors across all subjects) and the corresponding comparison results.
Left/right: gender/emotion categories; first three rows: audiovisual, visual-only, and auditory-only stimulus conditions, respectively; fourth row: the reproducibility ratio in the audiovisual condition minus the maximum of the reproducibility ratios in the visual-only and auditory-only conditions.
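The comparison in the fourth row is a per-subject difference: each subject's audiovisual reproducibility ratio minus the larger of that subject's visual-only and auditory-only ratios, summarized across subjects by a mean and standard error. A short illustrative computation follows; the per-subject values are hypothetical placeholders, not data from the study.

import numpy as np

# Hypothetical reproducibility ratios, one entry per subject.
r_av = np.array([0.62, 0.58, 0.71, 0.65, 0.60])  # audiovisual
r_v  = np.array([0.45, 0.50, 0.55, 0.48, 0.52])  # visual-only
r_a  = np.array([0.40, 0.47, 0.49, 0.44, 0.46])  # auditory-only

# Fourth-row quantity: AV minus the maximum unisensory ratio, per subject.
diff = r_av - np.maximum(r_v, r_a)

mean = diff.mean()
sem = diff.std(ddof=1) / np.sqrt(diff.size)  # standard error across subjects
print(f"mean difference = {mean:.3f} +/- {sem:.3f} (SEM)")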
Figure 4. Cross-reproducibility ratios (means and standard errors across all subjects) in the audiovisual stimulus conditions with the number, gender, and emotion tasks.
Left/right: gender/emotion categories.
Figure 5. Functional connectivity between the heteromodal areas and the brain areas encoding the gender feature (A) or the emotion feature (B).
Green spheres: brain areas from Table 2 in (A) or Table 3 in (B). Magenta spheres: heteromodal areas. Yellow lines: connections from the heteromodal areas to the informative brain areas. Blue lines: connections from the informative brain areas to the heteromodal areas. Purple lines: bidirectional connections. Numbers in brackets: total numbers of functional connections.
