PLoS Biol. 2021 Nov 18;19(11):e3001465. doi: 10.1371/journal.pbio.3001465. eCollection 2021 Nov.

Attention controls multisensory perception via two distinct mechanisms at different levels of the cortical hierarchy

Ambra Ferrari et al. PLoS Biol. 2021.


Abstract

To form a percept of the multisensory world, the brain needs to integrate signals from common sources weighted by their reliabilities and segregate those from independent sources. Previously, we have shown that anterior parietal cortices combine sensory signals into representations that take into account the signals' causal structure (i.e., common versus independent sources) and their sensory reliabilities as predicted by Bayesian causal inference. The current study asks to what extent and how attentional mechanisms can actively control how sensory signals are combined for perceptual inference. In a pre- and postcueing paradigm, we presented observers with audiovisual signals at variable spatial disparities. Observers were precued to attend to auditory or visual modalities prior to stimulus presentation and postcued to report their perceived auditory or visual location. Combining psychophysics, functional magnetic resonance imaging (fMRI), and Bayesian modelling, we demonstrate that the brain moulds multisensory inference via two distinct mechanisms. Prestimulus attention to vision enhances the reliability and influence of visual inputs on spatial representations in visual and posterior parietal cortices. Poststimulus report determines how parietal cortices flexibly combine sensory estimates into spatial representations consistent with Bayesian causal inference. Our results show that distinct neural mechanisms control how signals are combined for perceptual inference at different levels of the cortical hierarchy.
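Concretely, the reliability-weighted integration mentioned here has a standard closed form in maximum-likelihood cue combination; the worked equation below is a generic illustration of that principle, not notation copied from the paper.

```latex
% Reliability-weighted (forced) fusion of auditory and visual measurements:
% each cue is weighted by its reliability, the inverse of its variance.
\hat{S}_{AV} \;=\; w_A\, x_A + w_V\, x_V,
\qquad
w_A = \frac{1/\sigma_A^2}{1/\sigma_A^2 + 1/\sigma_V^2},
\quad
w_V = \frac{1/\sigma_V^2}{1/\sigma_A^2 + 1/\sigma_V^2}.
```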


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Bayesian Causal Inference and the possible roles of attentional control.
(a) Generative models of Forced Fusion and Bayesian Causal Inference. For Forced Fusion, a single source generates the auditory and visual signals. Bayesian Causal Inference explicitly models the two causal structures, i.e., whether auditory and visual signals come from one common cause (C = 1) or from separate causes (C = 2). (b) During perceptual inference, the observer is thought to invert the generative models, inferring the number of sources by combining prior knowledge and audiovisual evidence. A Forced Fusion estimate is computed by averaging the auditory and visual estimates along with prior spatial estimates, weighted by their relative reliabilities (the inverse of the sensory variance σ²). The full segregation estimates, visual or auditory, are computed separately. To account for causal uncertainty, the final Bayesian Causal Inference estimate, auditory (Ŝ_A) or visual (Ŝ_V), is computed by combining the audiovisual Forced Fusion estimate (Ŝ_AV,C=1) with the task-relevant full segregation estimate, auditory (Ŝ_A,C=2) or visual (Ŝ_V,C=2), each weighted by the posterior probability of a common cause (C = 1) or independent causes (C = 2). (c) Attentional control can mould multisensory perceptual inference via two distinct mechanisms and thereby induce differences in observers’ auditory and visual estimates. First, attending to a particular sensory modality may enhance the reliability of signals in the attended modality and thereby their weights during Forced Fusion. Second, modality-specific report (i.e., task relevance) determines the late readout consistent with the principles of Bayesian Causal Inference, i.e., whether the Forced Fusion estimate is combined with the auditory or the visual full segregation estimate.
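To make the caption’s computation concrete, here is a minimal Python sketch of Bayesian Causal Inference by model averaging for the auditory estimate, using the standard closed-form likelihoods from this modelling literature (e.g., Körding et al. 2007). All parameter values and names are illustrative assumptions, not the study’s implementation.

```python
import numpy as np

def bci_auditory_estimate(x_a, x_v, sigma_a, sigma_v,
                          sigma_p=10.0, mu_p=0.0, p_common=0.5):
    """Bayesian Causal Inference (model averaging) for the auditory estimate.

    x_a, x_v : internal auditory / visual measurements (deg azimuth)
    sigma_a, sigma_v : sensory noise SDs (attention may lower the attended SD)
    sigma_p, mu_p : spatial prior SD and mean (illustrative values)
    p_common : prior probability of a common cause
    """
    va, vv, vp = sigma_a**2, sigma_v**2, sigma_p**2

    # Forced Fusion estimate: reliability-weighted average of both cues and prior
    s_fusion = (x_a / va + x_v / vv + mu_p / vp) / (1 / va + 1 / vv + 1 / vp)

    # Full segregation (auditory) estimate: auditory cue combined with prior only
    s_seg_a = (x_a / va + mu_p / vp) / (1 / va + 1 / vp)

    # Likelihood of the measurements under one common cause (C = 1)
    denom1 = va * vv + va * vp + vv * vp
    like_c1 = np.exp(-0.5 * ((x_a - x_v) ** 2 * vp
                             + (x_a - mu_p) ** 2 * vv
                             + (x_v - mu_p) ** 2 * va) / denom1
                     ) / (2 * np.pi * np.sqrt(denom1))

    # Likelihood under independent causes (C = 2)
    like_c2 = np.exp(-0.5 * ((x_a - mu_p) ** 2 / (va + vp)
                             + (x_v - mu_p) ** 2 / (vv + vp))
                     ) / (2 * np.pi * np.sqrt((va + vp) * (vv + vp)))

    # Posterior probability of a common cause
    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

    # Model averaging: combine fusion and task-relevant segregation estimates
    return post_c1 * s_fusion + (1 - post_c1) * s_seg_a

# Example: 18 deg audiovisual disparity, vision more reliable than audition;
# the large disparity pushes the estimate towards the segregated auditory one
print(bci_auditory_estimate(x_a=9.0, x_v=-9.0, sigma_a=6.0, sigma_v=2.0))
```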
Fig 2. Experimental design and procedure, neuroimaging univariate results, and response times in the fMRI experiment.
(a) The experiment conformed to a 3 (auditory location) × 3 (visual location) × 2 (prestimulus attention: attA, attV) × 2 (poststimulus report: repA, repV) factorial design (A for auditory and V for visual). Auditory and visual signals were independently sampled from 3 locations along the azimuth (−9°, 0°, and 9° visual angle), resulting in 9 audiovisual spatial combinations with 3 levels of spatial disparity: none (0°; dark grey); low (9°; mid grey); and high (18°; light grey). The orthogonal pre- and postcue attention cueing paradigm resulted in two valid (attArepA; attVrepV) and two invalid (attVrepA; attArepV) conditions. (b) Prior to block start, participants were cued to attend to either the auditory or the visual signal (via the colour of the fixation cross); 350 ms after each audiovisual stimulus, they were cued to report their perceived auditory or visual location (via a coloured letter: A for auditory and V for visual). Participants responded via a button press, using a different keypad for each sensory modality. (c) Increased activations for invalid relative to valid trials [Invalid (attVrepA & attArepV) > Valid (attArepA & attVrepV)] in blue, for AV spatially incongruent relative to congruent stimuli [AVincongruent (AV disparity ≠ 0°) > AVcongruent (AV disparity = 0°)] in red, and their overlap in pink, rendered on an inflated canonical brain (p < 0.001 uncorrected at peak level for visualisation purposes, extent threshold k > 0 voxels). (d) Across participants’ mean (±SEM) parameter estimates in arbitrary units from L SFG (x = −4, y = 8, and z = 52) and L ACC (x = −10, y = 18, and z = 32). (e) Across participants’ mean (±SEM) response times. Data in d and e are plotted as a function of (i) prestimulus attention: auditory attA versus visual attV; (ii) poststimulus report: auditory repA versus visual repV; and (iii) audiovisual spatial (in)congruency: AVincongruent (AV disparity ≠ 0°) versus AVcongruent (AV disparity = 0°). The data used to make this figure are available in S1 and S2 Data. ACC, anterior cingulate cortex; AIns, anterior insula; IFG, inferior frontal gyrus; IPS, intraparietal sulcus; L ACC, left anterior cingulate cortex; L SFG, left superior frontal gyrus; SFG, superior frontal gyrus; SPL, superior parietal lobule.
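As a concrete illustration of panel (a), a few lines of Python (illustrative only) enumerate the 9 audiovisual location combinations and the 3 disparity levels they produce:

```python
from itertools import product

locations = [-9, 0, 9]  # degrees visual angle along the azimuth

# 3 (auditory) x 3 (visual) = 9 audiovisual spatial combinations
for loc_a, loc_v in product(locations, locations):
    disparity = abs(loc_a - loc_v)  # 0 (none), 9 (low), or 18 (high) degrees
    print(f"A={loc_a:+d}, V={loc_v:+d}, disparity={disparity} deg")
```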
Fig 3. Audiovisual weight index (wAV) and Bayesian modelling results for the fMRI experiment.
(a) Across participants’ mean wAV (±SEM) shown as a function of (i) prestimulus attention: auditory attA versus visual attV; (ii) poststimulus report: auditory repA versus visual repV; and (iii) AV spatial disparity: low dispL (9°) versus high dispH (18°). wAV = 1 for purely visual and wAV = 0 for purely auditory influence. (b) Along the first factor of a 2 × 3 factorial model space, we assessed the influence of prestimulus attention by comparing whether the sensory variances were (i) constant (fixed: σ²_A,attA = σ²_A,attV, σ²_V,attA = σ²_V,attV) or (ii) different (free: σ²_A,attA, σ²_A,attV, σ²_V,attA, σ²_V,attV) across prestimulus attention. Along the second factor, we assessed the influence of poststimulus report by comparing (i) a Forced Fusion model in which the sensory variances were fixed (FF fixed: σ²_A,repA = σ²_A,repV, σ²_V,repA = σ²_V,repV); (ii) a Forced Fusion model in which the sensory variances were allowed to differ between auditory and visual report (FF free: σ²_A,repA, σ²_A,repV, σ²_V,repA, σ²_V,repV); and (iii) a BCI model in which the influence of poststimulus report arises via a late flexible readout. The matrix represents our 2 × 3 model space. For each model, we show the pEP (a larger pEP indicates a better model) via greyscale. BOR represents the probability that the results are due to chance. (c) Across participants’ mean (±SEM) auditory and visual noise parameter estimates (i.e., σ²_A,attA, σ²_A,attV, σ²_V,attA, σ²_V,attV) of the best model, i.e., the BCI model with free prestimulus attention parameters (attA, auditory; attV, visual). p-values are based on one-tailed sign permutation tests. The data used to make this figure are available in S2 Data. BCI, Bayesian causal inference; BOR, Bayesian omnibus risk; FF, Forced Fusion; pEP, protected exceedance probability.
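The caption defines what wAV means (1 = purely visual, 0 = purely auditory influence) but not how it is computed. One common way to obtain such an index is to regress responses on the incongruent trials against the true visual and auditory locations and normalise the coefficients; the sketch below illustrates that assumed form and is not the study’s actual procedure.

```python
import numpy as np

def av_weight_index(responses, loc_v, loc_a):
    """Illustrative audiovisual weight index (assumed regression-based form).

    responses : reported locations on spatially incongruent trials
    loc_v, loc_a : true visual and auditory locations per trial
    Returns ~1 for purely visual and ~0 for purely auditory influence.
    """
    X = np.column_stack([loc_v, loc_a, np.ones_like(loc_v)])
    beta_v, beta_a, _ = np.linalg.lstsq(X, responses, rcond=None)[0]
    return beta_v / (beta_v + beta_a)

# Toy example: responses dominated by the visual location -> index near 1
loc_v = np.array([-9.0, -9.0, 9.0, 9.0])
loc_a = np.array([0.0, 9.0, -9.0, 0.0])
responses = 0.8 * loc_v + 0.2 * loc_a
print(av_weight_index(responses, loc_v, loc_a))  # ~0.8
```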
Fig 4. Neural audiovisual weight index (nwAV) across the audiovisual processing hierarchy.
(a) fMRI voxel response patterns were obtained from anatomical ROIs along the visual and auditory dorsal cortical hierarchies: V1-3 (blue), pIPS (cyan), aIPS (green), and hA (orange). ROIs are displayed on a canonical brain. (b) An SVR model was trained to learn the mapping from the fMRI voxel response patterns to the external spatial locations based on the audiovisual spatially congruent trials (green cells = congruent). The learnt mapping was then used to decode the spatial location from the fMRI voxel response patterns of the audiovisual spatially incongruent trials (orange cells = incongruent) to compute nwAV. (c) Across participants’ mean nwAV (±SEM) shown as a function of (i) prestimulus attention (Att): auditory/attA versus visual/attV; and (ii) poststimulus report (Rep): auditory/repA versus visual/repV, with statistical results of sign permutation tests. nwAV = 1 for purely visual and nwAV = 0 for purely auditory influence. The data used to make this figure are available in S2 Data. ** p < 0.01, * p < 0.05. aIPS, anterior intraparietal sulcus; hA, higher-order auditory cortex; pIPS, posterior intraparietal sulcus; ROI, region of interest; SVR, support vector regression; V1-3, low-level visual cortex.
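For readers unfamiliar with the decoding scheme in panel (b), the following sketch reproduces the train-on-congruent / decode-incongruent logic with scikit-learn’s SVR on random placeholder data; the array names and the final weight-index step are illustrative assumptions rather than the study’s code.

```python
import numpy as np
from sklearn.svm import SVR

# Placeholder data: one row of voxel responses per trial within an ROI.
# Real inputs would be preprocessed fMRI response patterns per trial.
rng = np.random.default_rng(0)
X_congruent = rng.normal(size=(90, 200))                 # congruent trials
loc_congruent = rng.choice([-9.0, 0.0, 9.0], size=90)    # true AV location
X_incongruent = rng.normal(size=(60, 200))               # incongruent trials
loc_v = rng.choice([-9.0, 0.0, 9.0], size=60)            # true visual location
loc_a = rng.choice([-9.0, 0.0, 9.0], size=60)            # true auditory location

# Train an SVR to map voxel patterns onto spatial location (congruent trials)...
svr = SVR(kernel="linear").fit(X_congruent, loc_congruent)

# ...then decode the spatial location of the incongruent trials
decoded = svr.predict(X_incongruent)

# Neural weight index from the decoded locations (illustrative regression form,
# analogous to the behavioural index): ~1 = visual, ~0 = auditory influence
X = np.column_stack([loc_v, loc_a, np.ones_like(loc_v)])
beta_v, beta_a, _ = np.linalg.lstsq(X, decoded, rcond=None)[0]
nw_av = beta_v / (beta_v + beta_a)
print(nw_av)
```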


