Human Visual Pathways for Action Recognition versus Deep Convolutional Neural Networks: Representation Correspondence in Late but Not Early Layers

Yujia Peng¹ ² ³ ⁴, Xizi Gong¹, Hongjing Lu⁴ ⁵, Fang Fang¹ ⁶ ⁷ ⁸

Affiliations

¹ School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, People's Republic of China.
² Institute for Artificial Intelligence, Peking University, Beijing, People's Republic of China.
³ National Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence, Beijing, China.
⁴ Department of Psychology, University of California, Los Angeles.
⁵ Department of Statistics, University of California, Los Angeles.
⁶ IDG/McGovern Institute for Brain Research, Peking University, Beijing, People's Republic of China.
⁷ Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, People's Republic of China.
⁸ Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, People's Republic of China.

PMID: 39106158
DOI: 10.1162/jocn_a_02233

Yujia Peng et al. J Cogn Neurosci. 2024 Nov 1;36(11):2458-2480. doi: 10.1162/jocn_a_02233.

Abstract

Deep convolutional neural networks (DCNNs) have attained human-level performance for object categorization and exhibited representation alignment between network layers and brain regions. Does such representation alignment naturally extend to other visual tasks beyond recognizing objects in static images? In this study, we expanded the exploration to the recognition of human actions from videos and assessed the representation capabilities and alignment of two-stream DCNNs in comparison with brain regions situated along ventral and dorsal pathways. Using decoding analysis and representational similarity analysis, we show that DCNN models do not show hierarchical representation alignment with the human brain across visual regions when processing action videos. Instead, later layers of DCNN models demonstrate greater representational similarity to the human visual cortex. These findings were revealed for two display formats: photorealistic avatars with full-body information and simplified stimuli in the point-light display. The discrepancies in representation alignment suggest fundamental differences in how DCNNs and the human brain represent dynamic visual information related to actions.
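
For readers unfamiliar with representational similarity analysis, the comparison named in the abstract can be illustrated with a minimal sketch. The abstract does not state which dissimilarity measure, brain regions, or network layers were used, so everything below is an assumption made for illustration: the correlation-distance dissimilarity matrix, the Spearman comparison, the synthetic data, and all function and variable names are hypothetical and are not the authors' pipeline.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows.

    patterns has shape (n_conditions, n_features), e.g. one row per action video.
    """
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, since an RDM is symmetric."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def spearman(x, y):
    """Spearman correlation via ranks (simple version, no tie correction)."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

def rsa_score(patterns_a, patterns_b):
    """Similarity of two representations, compared through their RDMs."""
    return spearman(upper_triangle(rdm(patterns_a)),
                    upper_triangle(rdm(patterns_b)))

# Synthetic stand-ins (assumed, not real data): 8 action conditions.
rng = np.random.default_rng(0)
n_actions = 8
latent = rng.normal(size=(n_actions, 5))  # shared structure
brain_roi = latent @ rng.normal(size=(5, 200)) + 0.1 * rng.normal(size=(n_actions, 200))
late_layer = latent @ rng.normal(size=(5, 512)) + 0.1 * rng.normal(size=(n_actions, 512))
early_layer = rng.normal(size=(n_actions, 512))  # unrelated structure

print("late layer vs. ROI :", rsa_score(late_layer, brain_roi))
print("early layer vs. ROI:", rsa_score(early_layer, brain_roi))
```

Because the comparison is made on dissimilarity matrices rather than on raw activations, the two representations may have different dimensionality (voxels versus network units), which is what makes the approach suitable for comparing brain regions with network layers.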
