Sci Adv. 2025 Jul 4;11(27):eads6821. doi: 10.1126/sciadv.ads6821. Epub 2025 Jul 2.

Fast and robust visual object recognition in young children


Vladislav Ayzenberg et al.

Abstract

By adulthood, humans rapidly identify objects from sparse visual displays and across large disruptions to their appearance. What are the minimal conditions needed to achieve robust recognition abilities and when might these abilities develop? To answer these questions, we investigated the upper limits of children's object recognition abilities. We found that children as young as 3 years successfully identified objects at speeds of 100 milliseconds (both forward and backward masked) under sparse and disrupted viewing conditions. By contrast, a range of computational models implemented with biologically informed properties or optimized for visual recognition did not reach child-level performance. Models only matched children if they received more object examples than children are capable of experiencing. These findings highlight the robustness of the human visual system in the absence of extensive experience and identify important developmental constraints for building biologically plausible machines.


Figures

Fig. 1. Stimuli and human testing procedure.
(A) Children and adults were tested with object outlines that had either complete, perturbed, or deleted contours. (B) On each trial, participants were presented with an object image rapidly (100- to 300-ms duration), which was both forward and backward masked. In the prompt phase, child participants were asked to verbally indicate which object they saw among two possibilities (read by an experimenter). Adult participants responded by pressing an arrow key that corresponded to each object label.
Fig. 2. Children’s performance for each condition.
Across age, children performed above chance for each condition at each duration. Error bars depict 95% confidence intervals. The dotted black line indicates chance performance (0.50).
Fig. 3. Performance under each condition by age group.
(A) Under the complete condition, participants of all ages performed above chance, even at the fastest speeds. (B) Under the perturbed condition, 4- and 5-year-olds performed above chance at all speeds, whereas 3-year-olds were above chance only at durations of 200 ms and slower. (C) Under the deleted condition, 4- and 5-year-olds performed above chance at all speeds, whereas 3-year-olds performed above chance only at the slowest speeds (250 and 300 ms). Error bars depict 95% confidence intervals. The dotted black line indicates chance performance (0.50).
Fig. 4. Influence of low-level shape features.
Performance separated by (A and B) curvature and (C and D) shape envelope similarity across different [(A) and (C)] stimulus durations and [(B) and (D)] age groups. The black dotted line indicates chance performance (0.50). Error bars depict 95% confidence intervals.
Fig. 5. Model and human performance under each condition.
Performance of models and humans under the (top) complete, (middle) perturbed, and (bottom) deleted contour conditions. Human data for each age (red dotted lines: children; gray dotted lines: adults) were aggregated into fast (100 and 150 ms) and slow (200 and 250 ms) stimulus durations. Humans were compared to (A to C) biologically inspired (blue: ventral-like architecture; green: trained on child experience) and performance-optimized (orange: classification objective; violet: unsupervised and vision-language objective) models and (D to F) models selected to disambiguate between the contributions of training type, scale, and learning objective (yellow: classification objective; purple: vision-language objective). The y axis indicates classification accuracy. The black dotted line indicates chance performance (0.5). Error bars depict 95% confidence intervals for models. See fig. S3 and tables S2 to S4 for variability estimates and confidence intervals for human data.
Fig. 6. Recognition performance as a function of experience.
Scatter plots showing the relation between classification accuracy (y axis) for the (A) complete, (B) perturbed, and (C) deleted contour conditions and the total number of images the models were trained on (x axis, log scale). Human estimates are plotted as stars; human experience is conservatively estimated as seeing one object every second of life, without sleep.
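The human experience estimate in Fig. 6 is simple arithmetic, and a back-of-envelope sketch makes the scale concrete. The following is a minimal illustration, not the authors' analysis code: the one-object-per-second, no-sleep rate is the caption's stated assumption, and the function name is hypothetical.

```python
# Upper bound on a child's visual "training set" size, assuming
# (per the Fig. 6 caption) one object seen every second of life,
# counting all seconds with no sleep.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ~3.15e7 seconds

def max_object_exposures(age_years: float) -> int:
    """Upper-bound count of object exposures by a given age."""
    return int(age_years * SECONDS_PER_YEAR)

for age in (3, 4, 5):
    print(f"Age {age}: ~{max_object_exposures(age):.1e} object exposures")
```

Even under this deliberately generous bound, a 3-year-old's lifetime exposure is on the order of 10^7 to 10^8 images, which is why models requiring substantially larger training sets fall outside the range of experience available to children.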
