What Do Visual Neural Networks Learn?
- PMID: 40729617
- DOI: 10.1146/annurev-vision-110323-112903
Abstract
Over the past decade, artificial neural networks trained to classify images downloaded from the internet have achieved astounding, almost superhuman performance and have been suggested as possible models for human vision. In this article, we review experimental evidence from multiple studies elucidating the classification strategy learned by successful visual neural networks (VNNs) and how this strategy may be related to human vision as well as previous approaches to computer vision. The studies we review evaluate the performance of VNNs on carefully designed tasks that are meant to tease out the cues they use. This approach shows that VNNs are often fooled by image changes to which human object recognition is largely invariant (e.g., the change of a few pixels in the image or a change of the background or illumination), and, conversely, that the networks can be invariant to very large image manipulations that disrupt human performance (e.g., randomly permuting the patches of an image). Taken together, the evidence suggests that these networks have learned relatively low-level cues that are extremely effective at classifying internet images but are ineffective at classifying many other images that humans can classify effortlessly.
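As a rough illustration of the kind of manipulation mentioned above, and not code from any of the reviewed studies, the sketch below randomly permutes the non-overlapping patches of an image. The function name, patch size, and usage are illustrative assumptions; in a study of this kind, one would compare a classifier's predictions on the original and patch-shuffled versions of the same images, since a network relying on local, texture-like cues may keep its prediction while human recognition typically breaks down.

```python
# Minimal sketch (assumed names and parameters): shuffle non-overlapping
# patches of an image to probe whether a classifier depends on global
# arrangement or on local cues, in the spirit of the patch-permutation
# experiments the review describes.
import numpy as np

def shuffle_patches(image: np.ndarray, patch_size: int, rng=None) -> np.ndarray:
    """Randomly permute non-overlapping patch_size x patch_size patches.

    image: H x W x C array; H and W are assumed to be multiples of patch_size.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    gh, gw = h // patch_size, w // patch_size
    # Cut the image into a (gh*gw, patch_size, patch_size, C) stack of patches.
    patches = (
        image[: gh * patch_size, : gw * patch_size]
        .reshape(gh, patch_size, gw, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(gh * gw, patch_size, patch_size, c)
    )
    # Permute the patches and reassemble an image of the original size.
    patches = patches[rng.permutation(len(patches))]
    return (
        patches.reshape(gh, gw, patch_size, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(gh * patch_size, gw * patch_size, c)
    )

if __name__ == "__main__":
    # Toy example: a synthetic 224x224 RGB image shuffled into 56-pixel patches.
    img = np.arange(224 * 224 * 3, dtype=np.uint8).reshape(224, 224, 3)
    shuffled = shuffle_patches(img, patch_size=56)
    print(img.shape, shuffled.shape)  # (224, 224, 3) (224, 224, 3)
```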
Keywords: computer vision; convolutional neural networks; robustness; vision transformers.