Cereb Cortex. 2025 Aug 1;35(8):bhaf208.
doi: 10.1093/cercor/bhaf208.

Decoding semantic sound categories in early visual cortex


Giusi Pollicina et al. Cereb Cortex. 2025.

Abstract

Early visual cortex, once thought to be used exclusively for visual processing, has been shown to represent auditory information in the absence of visual stimulation. However, the exact information content of these representations is still unclear, as is their degree of specificity. Here, we acquired functional magnetic resonance imaging (fMRI) data while blindfolded human participants listened to 36 natural sounds, hierarchically organized into semantic categories. Multivoxel pattern analysis revealed that animate and inanimate sounds, as well as human, animal, vehicle, and object sounds, could be decoded from fMRI activity patterns in early visual regions V1, V2, and V3. Further, pairwise classification of the different sound categories demonstrated that sounds produced by humans were represented in early visual cortex more distinctively than other semantic categories. Whole-brain searchlight analysis showed that sounds could also be decoded in higher-level visual and multisensory brain regions. Our findings extend our understanding of early visual cortex function beyond visual feature processing and show that semantic and categorical sound information is represented in early visual cortex, where it is potentially used to predict visual input.
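
The ROI-based decoding described above can be illustrated with a minimal sketch. The voxel patterns and labels below are simulated placeholders (in the study, trial-wise response patterns would be extracted from retinotopically defined V1, V2, and V3), and scikit-learn's linear SVM is an assumption, since the abstract does not name the classifier.

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    # Simulated stand-in for trial-wise voxel patterns from one ROI:
    # 72 trials x 500 voxels, with binary animate/inanimate labels.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((72, 500))
    y = np.repeat([0, 1], 36)

    # Cross-validated linear classification of the voxel patterns.
    clf = LinearSVC(max_iter=10_000)
    cv = StratifiedKFold(n_splits=6, shuffle=True, random_state=0)
    accuracy = cross_val_score(clf, X, y, cv=cv).mean()
    print(f"cross-validated decoding accuracy: {accuracy:.3f}")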

Keywords: MVPA; audition; early visual cortex; fMRI; multisensory interaction.


Conflict of interest statement

No conflict of interest declared.

Figures

Fig. 1
fMRI experimental design and procedure. Participants were blindfolded and attentively listened to 36 different natural sounds, subdivided into several semantic categories, while detecting an occasional target tone present in 10% of the trials. For retinotopic polar mapping of early visual cortex regions, participants watched a flickering rotating wedge and detected a color change at the center. We mapped V1, V2, and V3 in each participant using retinotopic activity maps projected onto individually reconstructed cortical surfaces. The lower right panel shows the right hemisphere retinotopic map of one example participant.
Fig. 2
MVPA classification accuracy for (a) animate versus inanimate sounds; (b) human, animal, vehicle, and object sounds; (c) the 12 subcategories; and (d) individual pairs of sound categories. Significant above-chance classification was established by permutation analyses across 1,000 iterations of randomized-label classification. Solid horizontal lines denote the theoretical chance level, which was indistinguishable from the permutation-derived empirical chance level. Because separate classifications and permutation analyses were run in a, b, c, and d and in each ROI, results were not corrected for multiple comparisons. The bracket over auditory cortex indicates a main effect across all classification pairs. Dots represent individual participants’ data; error bars indicate SEM. *P < 0.05; **P < 0.01; ***P = 0.001.
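
A minimal sketch of the permutation scheme described in the caption, reusing the clf, X, y, and cv objects from the decoding sketch above: labels are shuffled 1,000 times, and the resulting null distribution supplies both an empirical chance level and a one-tailed p-value.

    import numpy as np
    from sklearn.model_selection import cross_val_score

    def permutation_p(clf, X, y, cv, n_perm=1_000, seed=0):
        """Empirical chance level and one-tailed p-value from label permutation."""
        rng = np.random.default_rng(seed)
        observed = cross_val_score(clf, X, y, cv=cv).mean()
        null = np.empty(n_perm)
        for i in range(n_perm):
            # Shuffling labels breaks the pattern-category link while
            # preserving everything else about the classification pipeline.
            null[i] = cross_val_score(clf, X, rng.permutation(y), cv=cv).mean()
        p = (np.sum(null >= observed) + 1) / (n_perm + 1)
        return observed, null.mean(), p

Note that with 1,000 permutations the smallest attainable p-value is roughly 0.001, which would explain why the strongest effects are reported as P = 0.001 rather than P < 0.001.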
Fig. 3
Results of the whole-brain searchlight analysis for (a) the animate–inanimate classification and (b) the human–animal–vehicle–object classification, cluster-threshold corrected. Results are displayed on an inflated MNI template cortical surface reconstruction.
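
For illustration, a whole-brain searchlight of this general kind can be run with nilearn's SearchLight. The image, mask, and labels below are tiny simulated stand-ins, and the sphere radius and cross-validation scheme are illustrative choices, not parameters taken from the paper.

    import numpy as np
    import nibabel as nib
    from nilearn.decoding import SearchLight
    from sklearn.model_selection import StratifiedKFold

    # Toy 4D "fMRI" data: a 10x10x10 volume with 24 trial-wise images,
    # plus an all-ones brain mask and alternating category labels.
    rng = np.random.default_rng(0)
    affine = np.eye(4)
    func_img = nib.Nifti1Image(rng.standard_normal((10, 10, 10, 24)), affine)
    mask_img = nib.Nifti1Image(np.ones((10, 10, 10), dtype=np.int8), affine)
    y = np.tile([0, 1], 12)

    # Fit a classifier in a sphere centered on every voxel in the mask.
    sl = SearchLight(mask_img, radius=3.0, estimator="svc",
                     cv=StratifiedKFold(n_splits=4), n_jobs=1)
    sl.fit(func_img, y)
    # sl.scores_ holds a 3D map of per-sphere accuracies that could then
    # be thresholded, cluster-corrected, and projected onto the surface.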
Fig. 4
GLM results for the inanimate > animate contrast, projected onto an inflated MNI template cortical surface reconstruction and thresholded at P = 0.01 (FDR corrected). Warm colors represent significant activation for inanimate sounds; cold colors indicate significant activation for animate sounds.
Fig. 5
GLM results showing significant activations for each of the four categories (human, animal, vehicle, and object) contrasted against the other three (color coded as per the legend). Contrasts are projected onto an inflated MNI template cortical surface reconstruction and thresholded at P = 0.01 (FDR corrected).
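
The FDR correction applied to the contrast maps in Figs. 4 and 5 can be sketched in its standard Benjamini–Hochberg form (an assumption: the captions do not state which FDR procedure was used); the voxelwise p-values here are simulated for illustration.

    import numpy as np

    def fdr_threshold(pvals, q=0.01):
        """Largest p-value surviving Benjamini-Hochberg FDR at level q."""
        p = np.sort(np.asarray(pvals))
        n = p.size
        below = p <= q * np.arange(1, n + 1) / n
        return p[below].max() if below.any() else 0.0

    rng = np.random.default_rng(0)
    pvals = rng.uniform(size=50_000) ** 3   # toy p-values skewed toward "signal"
    print(f"voxels survive up to p = {fdr_threshold(pvals):.5f}")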
Fig. 6
Univariate ROI activity levels for the (a) two-way and (b) four-way categorizations. Mean beta values are plotted for each sound condition in V1, V2, and V3, relative to baseline. Error bars indicate SEM.
Fig. 7
Individual scores from the vividness of visual imagery questionnaire plotted as a function of sound classification accuracy for the four-category classification in V3. Note that high VVIQ scores denote less vivid visual imagery.
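
The brain-behavior relation in Fig. 7 amounts to an across-participant correlation. A sketch with simulated placeholder data follows; the Pearson statistic is an assumption, as the caption does not name the test.

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)
    # Simulated VVIQ scores (16 items scored 1-5, hence 16-80) and simulated
    # four-way decoding accuracies in V3 for 20 participants.
    vviq = rng.uniform(16, 80, size=20)
    accuracy = 0.25 + 0.001 * vviq + rng.normal(0, 0.03, size=20)
    r, p = pearsonr(vviq, accuracy)
    print(f"r = {r:.2f}, p = {p:.3f}")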
