Cogn Neuropsychol. 2021 Oct-Dec;38(7-8):468-489. doi: 10.1080/02643294.2022.2085085. Epub 2022 Jun 21.

Cochlea to categories: The spatiotemporal dynamics of semantic auditory representations


Matthew X Lowe et al. Cogn Neuropsychol. 2021 Oct-Dec.

Abstract

How does the auditory system categorize natural sounds? Here we apply multimodal neuroimaging to illustrate the progression from acoustic to semantically dominated representations. Combining magnetoencephalographic (MEG) and functional magnetic resonance imaging (fMRI) scans of observers listening to naturalistic sounds, we found superior temporal responses beginning ∼55 ms post-stimulus onset, spreading to extratemporal cortices by ∼100 ms. Early regions were distinguished less by onset/peak latency than by functional properties and overall temporal response profiles. Early acoustically dominated representations trended systematically toward category dominance over time (after ∼200 ms) and space (beyond primary cortex). Semantic category representation was spatially specific: vocalizations were preferentially distinguished in frontotemporal voice-selective regions and the fusiform; scenes and objects were distinguished in parahippocampal and medial place areas. Our results are consistent with real-world events being coded via an extended auditory processing hierarchy, in which acoustic representations rapidly enter multiple streams specialized by category, including areas typically considered visual cortex.

Keywords: Audition; functional magnetic resonance imaging; magnetoencephalography; naturalistic sounds; representational similarity analysis.


Figures

Figure 1. Spatiotemporal propagation of auditory neural responses indexed by MEG-fMRI fusion.
A. Overview of MEG-fMRI decoding and RSA-based fusion. i) Brain responses, acquired in separate sessions per subject, were extracted for MVPA by time point t (whole-brain MEG sensor values, top) or anatomical region r (fMRI voxelwise t-values, bottom); r may represent successive searchlight volumes, as illustrated here, or anatomical regions of interest, as shown in Figures 2 and 3. ii) RDMs were computed using SVM decoding accuracy (MEG, top) or Pearson correlation distance (fMRI, bottom) to index pairwise dissimilarity between stimulus conditions. iii) MEG and fMRI RDMs were correlated to fuse the two modalities across space and time. B. Main cortical regions involved in the searchlight-based fMRI-MEG fusion, illustrated on two axial slices (z = 10 and −10 mm) at select time points. Significant neural representations begin over auditory cortex and spread toward prefrontal, ventral, and medial regions. Color-coded voxels indicate the strength of MEG-fMRI RDM correlations (Spearman's ρ, scaled between 0 and the maximal observed value; n = 16, cluster definition threshold p < 0.001, cluster threshold p < 0.05). See the full-brain fusion in Movie S1.
Figure 2. ROI-based MEG-fMRI fusion time courses.
A. Eleven regions of interest (ROIs) spanning the temporal, frontal, and occipital lobes were selected to examine auditory responses: primary auditory cortex (PAC), TE1.2, planum temporale (PT), planum polare (PP), temporal voice area (TVAx), left inferior frontal gyrus (LIFG), fusiform face area (FFA), parahippocampal place area (PPA), medial place area (MPA), lateral occipital area (LOC), and early visual cortex (EVC). B. ROI-specific fusion time courses (color-coded to match the ROI illustrations in A), indexing the correlation between whole-brain MEG and a single RDM representing each fMRI ROI. The gray vertical line denotes stimulus onset at t = 0 ms. Solid color-coded bars below the time courses mark significant time points, observed for all regions except EVC; black triangles indicate the respective peak latencies. The inset magnifies the 0–500 ms regime (the duration of the stimulus) for clarity. All statistics: P < 0.01, cluster threshold P < 0.05, 1000 permutations. C. Visualization of category and exemplar selectivity patterns across ROIs: RDMs and their corresponding multidimensional scaling (MDS) visualizations (in the first two dimensions) for the 11 fMRI ROIs. Each of the 80 stimuli is color-coded by category, as shown in the lower right panel. The dissimilarity scale is normalized to the maximum within each ROI.
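Panel C's two-dimensional embeddings follow from standard multidimensional scaling applied to each ROI RDM; below is a sketch using scikit-learn, where `roi_rdm` and the category `labels` are random placeholders rather than the study's data:

```python
# Sketch of a panel-C style MDS embedding of one ROI RDM (illustrative).
# roi_rdm stands in for an (80, 80) symmetric dissimilarity matrix; the
# 80 stimuli carry hypothetical category labels for color-coding.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

rng = np.random.default_rng(1)
roi_rdm = rng.random((80, 80))
roi_rdm = (roi_rdm + roi_rdm.T) / 2   # symmetrize
np.fill_diagonal(roi_rdm, 0.0)
roi_rdm /= roi_rdm.max()              # normalize to the ROI maximum

labels = np.repeat(np.arange(4), 20)  # e.g., 4 categories x 20 exemplars

# dissimilarity="precomputed" embeds the RDM directly into 2 dimensions.
xy = MDS(n_components=2, dissimilarity="precomputed",
         random_state=0).fit_transform(roi_rdm)
plt.scatter(xy[:, 0], xy[:, 1], c=labels, cmap="tab10")
plt.title("MDS of one ROI RDM (first two dimensions)")
plt.show()
```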
Figure 3. Emerging category vs. acoustic selectivity across time and ROIs.
A. Temporal dynamics in MEG responses. The Cochleagram (blue) and Category (red) model RDMs shown in the insets encode hypothesized similarity structures of acoustic and semantic neural stimulus representations. Pairwise Cochleagram model dissimilarities are indexed by the Euclidean distance between stimulus log-frequency spectra; for the Category model, all between-category pairs were assigned a value of 1 and within-category pairs a value of 0. The plot shows Fisher z-normalized correlation time courses between the models and whole-brain MEG data; the Cochleagram and Category models peak at 127 (CI 119–178) ms and 217 (CI 209–259) ms, respectively. The yellow trace is the Semantic Dominance (SD) difference score, Δz = z_Category − z_Cochleagram: negative SD indicates acoustically dominated responses, positive SD indicates semantic category-dominated responses. Color-coded bars indicate significant temporal clusters for each trace (the dark bar marks significantly negative SD, i.e., significantly stronger Cochleagram than Category model correlation). B. Model correlations with RDMs computed from fMRI ROI data; color code as in A. Bar plots depict the subject-averaged measure in each ROI; error bars, standard error of the mean (SEM). Numbered gray-shaded partitions indicate ranks assigned to ROIs according to anatomical position. Color-matched asterisks indicate significantly positive correlation; black horizontal bars with bisecting asterisks indicate significant correlation differences between pairs of STG ROIs. C. Histogram of 20,000 bootstrapped ROI-SD trend rank correlations, compared with the empirical correlations from our anatomical (magenta; 99.5th percentile) and fusion-derived (cyan; 98.3rd percentile) ROI rank assignments. D. Semantic subcategory distinctions in fMRI ROIs. Subject-averaged bar plots indicate distinctions by category animacy (purple), human vs. animal vocalizations (orange), and objects vs. scenes (green). Plotting conventions and statistics as in B; black horizontal bars with bisecting asterisks indicate significant correlation differences between models within STG/IFG ROIs. Bonferroni-corrected one-tailed t-tests; asterisks indicate significance at p < 0.05.
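The two model RDMs and the SD score in panel A follow directly from the definitions above; a sketch, with `log_spectra`, `labels`, and `data_rdm` as random placeholders for the actual stimuli and MEG data:

```python
# Sketch of the Figure 3A model RDMs and Semantic Dominance score
# (illustrative; spectra and labels here are random placeholders).
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_stim, n_freq = 80, 128
log_spectra = rng.random((n_stim, n_freq))   # stimulus log-frequency spectra
labels = np.repeat(np.arange(4), 20)         # hypothetical category labels

# Cochleagram model: Euclidean distance between log-frequency spectra.
cochleagram_rdm = squareform(pdist(log_spectra, metric="euclidean"))

# Category model: 1 for between-category pairs, 0 for within-category pairs.
category_rdm = (labels[:, None] != labels[None, :]).astype(float)

def model_corr(model_rdm, data_rdm):
    """Fisher z-normalized Spearman correlation between model and data RDMs."""
    i, j = np.triu_indices(n_stim, k=1)
    rho = spearmanr(model_rdm[i, j], data_rdm[i, j]).correlation
    return np.arctanh(rho)

# Semantic Dominance at one MEG time point: dz = z_Category - z_Cochleagram.
data_rdm = rng.random((n_stim, n_stim))
data_rdm = (data_rdm + data_rdm.T) / 2       # symmetrize the stand-in RDM
sd = model_corr(category_rdm, data_rdm) - model_corr(cochleagram_rdm, data_rdm)
```

Evaluating this difference score at every MEG time point yields the yellow SD trace, and evaluating it per fMRI ROI yields the ranked spatial trend tested in panel C.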

