Sci Rep. 2018 Feb 28;8(1):3752. doi: 10.1038/s41598-018-22160-9.

Deep Residual Network Predicts Cortical Representation and Organization of Visual Features for Rapid Categorization


Haiguang Wen et al. Sci Rep. 2018.

Abstract

The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations of 64,000 visual objects from 80 categories with high throughput and accuracy. These representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. Across the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. At a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. This hierarchical clustering of category representations was driven mostly by cortical representations of middle- to high-level object features. In summary, this study demonstrates a useful computational strategy for characterizing the cortical organization and representation of visual features for rapid categorization.
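As a rough illustration of the approach, the following is a minimal sketch of a voxel-wise encoding model in Python, assuming precomputed ResNet feature time series and fMRI voxel responses. The ridge regularization, array shapes, and variable names are illustrative stand-ins, not the authors' exact pipeline (which, as in typical fMRI encoding studies, would also handle hemodynamic delay and feature dimensionality).

```python
# Minimal voxel-wise encoding model sketch; all data are synthetic stand-ins.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_train, n_feat, n_vox = 2400, 1024, 500          # time points, features, voxels (illustrative)
X_train = rng.standard_normal((n_train, n_feat))  # ResNet layer features per movie time point (stand-in)
Y_train = rng.standard_normal((n_train, n_vox))   # observed fMRI response per voxel (stand-in)

# One linear model per voxel; Ridge fits all voxels jointly as multi-output regression.
model = Ridge(alpha=1.0)
model.fit(X_train, Y_train)

# Predict cortical responses to a novel testing movie.
X_test = rng.standard_normal((240, n_feat))
Y_pred = model.predict(X_test)                    # shape (240, n_vox)
```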


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
DNN-based voxel-wise encoding models. (a) Performance of ResNet-based encoding models in predicting the cortical responses to novel testing movies for three subjects. Accuracy is measured by the average Pearson's correlation coefficient (r) between the predicted and observed fMRI responses across five testing movies (q < 0.01 after correction for multiple comparisons using the false discovery rate (FDR) method, and with threshold r > 0.2). The prediction accuracy is displayed on both flat (top) and inflated (bottom left) cortical surfaces for Subject 1. (b) Variance of the cortical response to the testing movie explained by the layer-specific visual features in ResNet. The right panel shows the index of the ResNet layer that best explains the cortical response at every voxel. (c) Comparison between the ResNet-based and AlexNet-based encoding models. Each bar represents the mean ± SE of the prediction accuracy (normalized by the noise ceiling, i.e. dividing the prediction accuracy (r) by the noise ceiling at every voxel) within an ROI, across voxels and subjects; * indicates significance (p < 0.001) in a paired t-test.
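A sketch of how the accuracy measure described here could be computed, with synthetic stand-in arrays: per-voxel Pearson's r between predicted and observed responses, Benjamini-Hochberg FDR correction at q < 0.01, the r > 0.2 threshold, and normalization by a (placeholder) noise ceiling.

```python
# Prediction-accuracy evaluation sketch; all data are synthetic stand-ins.
import numpy as np
from scipy.stats import pearsonr
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(1)
n_time, n_vox = 240, 500
Y_pred = rng.standard_normal((n_time, n_vox))          # model predictions (stand-in)
Y_obs = Y_pred + rng.standard_normal((n_time, n_vox))  # observed = signal + noise (stand-in)

# Per-voxel Pearson's correlation between predicted and observed responses.
r = np.empty(n_vox)
p = np.empty(n_vox)
for v in range(n_vox):
    r[v], p[v] = pearsonr(Y_pred[:, v], Y_obs[:, v])

# FDR correction (Benjamini-Hochberg) at q < 0.01, combined with the r > 0.2 threshold.
significant, _, _, _ = multipletests(p, alpha=0.01, method="fdr_bh")
selected = significant & (r > 0.2)
print(f"{selected.sum()} of {n_vox} voxels pass q < 0.01 and r > 0.2")

# Normalize accuracy by a per-voxel noise ceiling (placeholder values here).
noise_ceiling = np.full(n_vox, 0.8)
r_normalized = r / noise_ceiling
```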
Figure 2
Human-face representations from encoding models and a functional localizer. (a) Model-simulated representation of a human face from the ResNet-based encoding models, displayed on both inflated (top) and flat (bottom) cortical surfaces. (b) The face vs. non-face contrast map obtained in a face-localizer experiment shows regions selective for human faces, including the occipital face area (OFA), the fusiform face area (FFA), and the posterior superior temporal sulcus (pSTS).
Figure 3
Cortical representations of 80 object categories. Each panel shows the representation map of one object category on the flat cortical surface of Subject 1, with the category label at the top left. The color bar indicates the cortical response. Each map covers the same extent of cortex as shown in Fig. 2a, bottom.
Figure 4
Category selectivity at individual cortical locations. (a) Category selectivity across the cortical surface. (b) Category-selectivity profiles of example cortical locations. For each location, the 10 categories with the highest responses are shown in descending order. (c) Category selectivity within ROIs (mean ± SE) in the early visual areas (red), ventral stream areas (green), and dorsal stream areas (blue).
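The selectivity profile in panel (b) amounts to ranking the 80 category responses at a single cortical location; a minimal sketch with a synthetic response matrix (all names and shapes are illustrative):

```python
# Per-voxel category-selectivity profile sketch; data are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(2)
categories = [f"category_{i}" for i in range(80)]  # hypothetical labels
responses = rng.standard_normal((80, 5000))        # (categories, voxels) response matrix (stand-in)

voxel = 1234                                       # an example cortical location
order = np.argsort(responses[:, voxel])[::-1]      # categories in descending response order
for i in order[:10]:                               # top 10, as in panel (b)
    print(f"{categories[i]}: {responses[i, voxel]:.2f}")
```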
Figure 5
Categorical similarity and clustering in cortical representation at the scale of the entire visual cortex. (a) The left panel shows the inter-category similarity matrix (Pearson's correlation r) of cortical representation; each element is the cortical similarity between a pair of categories averaged across subjects (see individual results in Supplementary Fig. S2). The matrix separates cleanly into three clusters with modularity Q = 0.35. The middle panel shows the inter-category similarity matrix of semantic meaning (measured by LCH). The right panel shows the Pearson's correlation between the inter-category cortical similarity and the inter-category semantic similarity under three measures (LCH, word2vec, and GloVe). (b) The three clusters of cortical representation correspond to three superordinate-level categories: non-biological objects, biological objects, and background scenes. The average cortical representations across categories within each cluster are shown on both inflated and flattened cortical surfaces.
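A sketch of the similarity analysis in panel (a): correlate category representation maps to obtain the inter-category cortical similarity matrix, then correlate its off-diagonal entries with a semantic similarity matrix. Inputs are synthetic stand-ins; the paper's semantic measures (LCH, word2vec, GloVe) are replaced by a placeholder.

```python
# Inter-category similarity analysis sketch; data are synthetic stand-ins.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
n_cat, n_vox = 80, 5000
maps = rng.standard_normal((n_cat, n_vox))  # one cortical representation map per category (stand-in)

# Inter-category cortical similarity: Pearson's r between category maps.
cortical_sim = np.corrcoef(maps)

# Placeholder semantic similarity between category labels (the paper used
# LCH, word2vec, and GloVe measures; here just random embeddings).
semantic_sim = np.corrcoef(rng.standard_normal((n_cat, 300)))

# Correlate the two similarity structures over the off-diagonal entries.
iu = np.triu_indices(n_cat, k=1)
r, p = pearsonr(cortical_sim[iu], semantic_sim[iu])
print(f"cortical-semantic correlation: r = {r:.2f}, p = {p:.3g}")
```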
Figure 6
Contributions of different levels of visual features to the similarity and modularity in cortical representation. (a) The left panel shows the inter-category similarity of cortical representations contributed by layer-wise category information, from the lowest (layer 1) to the highest (layer 50) layer. The order of categories is the same as in Fig. 5a. The right plot shows the corresponding modularity index for the visual features in each layer of ResNet; features at the middle layers give rise to the highest modularity. (b) 18 example visual features at the 31st layer, visualized in pixel space. Each feature is illustrated with 4 exemplars that maximize its representation. (c) The correlation between the inter-category cortical similarity and the inter-category semantic similarity (under the LCH, word2vec, and GloVe measures) is shown for each layer in ResNet.
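The modularity index can be illustrated by treating the similarity matrix as a weighted graph and scoring a community partition. The graph construction below (positive off-diagonal similarities as edge weights) and the greedy community detection are assumptions for illustration, not necessarily the paper's exact computation.

```python
# Modularity-index sketch over a stand-in similarity matrix.
import numpy as np
import networkx as nx
from networkx.algorithms import community

rng = np.random.default_rng(4)
n_cat = 80
sim = np.corrcoef(rng.standard_normal((n_cat, 300)))  # stand-in inter-category similarity matrix

# Build a weighted graph from positive off-diagonal similarities (an assumption).
G = nx.Graph()
G.add_nodes_from(range(n_cat))
for i in range(n_cat):
    for j in range(i + 1, n_cat):
        if sim[i, j] > 0:
            G.add_edge(i, j, weight=sim[i, j])

# Detect communities and score the partition with modularity Q.
clusters = community.greedy_modularity_communities(G, weight="weight")
Q = community.modularity(G, clusters, weight="weight")
print(f"modularity Q = {Q:.2f}")
```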
Figure 7
Categorical similarity and clustering in cortical representation within superordinate-level categories. (a) Fine-scale cortical areas specific to each superordinate-level category: biological objects (red), background scenes (green), and non-biological objects (blue). (b) The cortical similarity between categories in fine-scale cortical representation. The categories in each sub-cluster are displayed on the right. See individual results in Supplementary Fig. S2.
Figure 8
Contribution of layer-wise visual features to the similarity and modularity in cortical representations within superordinate-level categories. The left panel shows the similarity between categories in fine-scale cortical representations, as contributed by category information from individual layers. The order of categories is the same as in Fig. 7. The right plot shows the modularity index across all layers; the highest-layer visual features show the highest modularity for biological objects.
