Review

Proc Biol Sci. 2025 Aug;292(2053):20250602.

doi: 10.1098/rspb.2025.0602. Epub 2025 Aug 20.

Characterizing internal models of the visual environment

Micha Engeser^#¹,², Susan Ajith^#¹, Ilker Duymaz¹, Gongting Wang¹,³, Matthew J Foxwell⁴, Radoslaw M Cichy³, David Pitcher⁴, Daniel Kaiser¹,²,⁵

Affiliations

¹ Department of Mathematics and Computer Science, Physics, Geography, Justus Liebig University Giessen, Giessen, HE, Germany.
² Center for Mind, Brain and Behavior, Universities of Giessen, Marburg, and Darmstadt, Marburg, HE, Germany.
³ Department of Education and Psychology, Freie Universität Berlin, Berlin, Germany.
⁴ Department of Psychology, University of York, York, UK.
⁵ Cluster of Excellence "The Adaptive Mind", Universities of Giessen, Marburg, and Darmstadt, Giessen, HE, Germany.

^# Contributed equally.

PMID: 40829663
PMCID: PMC12364571
DOI: 10.1098/rspb.2025.0602

Micha Engeser et al. Proc Biol Sci. 2025 Aug.

Abstract

Despite the complexity of real-world environments, natural vision is seamlessly efficient. To explain this efficiency, researchers often use predictive processing frameworks, in which perceptual efficiency is determined by the match between the visual input and internal models of what the world should look like. In scene vision, predictions derived from our internal models of a scene should play a particularly important role, given the highly reliable statistical structure of our environment. Despite their importance for scene perception, we still do not fully understand what is contained in our internal models of the environment. Here, we highlight that the current literature disproportionately focuses on an experimental approach that tries to infer the contents of internal models from arbitrary, experimenter-driven manipulations in stimulus characteristics. To make progress, additional participant-driven approaches are needed, focusing on participants' descriptions of what constitutes a typical scene. We discuss how recent studies on memory and perception used methods like line drawings to characterize internal representations in unconstrained ways and on the level of individual participants. These emerging methods show that it is now time to also study natural scene perception from a different angle, starting with a characterization of an individual's expectations about the world.

Keywords: drawings; individual differences; internal models; predictive processing; scene representation; visual perception.

Conflict of interest statement

We declare we have no competing interests.

Figures

**Figure 1.**
Understanding internal models through stimulus manipulation. To infer the contents of internal models of the world, researchers have manipulated the real-world typicality of visual inputs on different levels. From left to right: manipulations in the typical positioning of individual objects across visual space [23], the typical composition of multiple objects across space (figure is reproduced from [24]), the semantic consistency between scenes and the objects they contain [25] and the structural coherence of the scene [26]. By comparing typically arranged stimuli with atypically arranged stimuli, such studies show that the visual system preferentially processes stimuli that are in accordance with our priors.

**Figure 2.**
A complementary approach for studying internal models of the world. The classical approach aims at discovering properties of internal models through stimulus manipulation, for instance by manipulating a scene’s global structure. Here, we highlight a complementary approach, in which the contents of internal models are described by observers, for instance through line drawing (where people draw typical versions of scenes) or scene arrangement methods (where people arrange physical or virtual scenes in typical ways). These descriptions can, in turn, be used to derive targeted predictions about processing efficiency for a set of inputs.

**Figure 3.**
Using drawings to describe representations in development, memory and perception. (a) In developmental research, the use of drawings allows researchers to gain insights into the emergence of detailed visual object representations (figure is reproduced from [62] under a CC BY 4.0 license). (b) In memory research, drawings can be used to quantify memory precision in free recall paradigms. For instance, in scenes with inconsistent objects, more detail about the inconsistent object is recalled, at the expense of recalling details of the scene (figure reproduced from [63]). (c) Using a similar free recall paradigm, Bainbridge & Baker [64] showed that scene boundaries are extended or compressed in memory, depending on the viewpoint and geometry of the original scene. (d) In perception research, drawings were used to probe the cortical filling-in of missing information. Participants’ drawings of what should be present in an occluded quadrant predict neural activation: response patterns in areas of primary visual cortex (V1) that respond to the occluded quadrant are well explained by low-level visual features of these drawings [65].

**Figure 4.**
Using drawings to link individual differences in internal models to idiosyncrasies in perception. (a) To assess the contents of internal models for real-world scenes, participants drew typical versions of scene categories (here: living rooms). (b) These drawings were converted to three-dimensional renders to control for visual differences. (c) During the subsequent categorization task, participants categorized briefly presented renders. Critically, they viewed renders based on their own drawings (‘own’ condition), other participants’ drawings (‘other’ condition) or renders created from scenes participants previously copied from a photograph (‘control’ condition, designed to control for drawing-related familiarity effects). Across two experiments with two (left) or six (right) scene categories, participants more accurately categorized renders from the ‘own’ condition than renders from the ‘other’ or ‘control’ conditions, suggesting that similarity to internal models on the individual level modulates scene processing in idiosyncratic ways [68].

**Figure 5.**
Using explicit scene arrangement to describe internal models. (a) Children of different age groups were asked to arrange a set of miniature objects across a dollhouse. Object arrangements showed that children first appreciate semantic object similarities and only later incorporate the typical spatial organization across groups of objects [77]. (b) Participants arranged objects in a VR environment into typical or atypical configurations. In subsequent search and memory tasks, participants performed better when the task was situated in the scenes constructed in a typical fashion (figure reproduced from [78] under a CC BY 4.0 license).

References

1. Clark A. 2013. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav. Brain Sci. 36, 181–204. (doi:10.1017/S0140525X12000477)
2. de Lange FP, Heilbron M, Kok P. 2018. How do expectations shape perception? Trends Cogn. Sci. 22, 764–779. (doi:10.1016/j.tics.2018.06.002)
3. Kayser C, Körding KP, König P. 2004. Processing of complex stimuli and natural scenes in the visual cortex. Curr. Opin. Neurobiol. 14, 468–473. (doi:10.1016/j.conb.2004.06.002)
4. Mirza MB, Adams RA, Mathys CD, Friston KJ. 2016. Scene construction, visual foraging, and active inference. Front. Comput. Neurosci. 10, 56. (doi:10.3389/fncom.2016.00056)
5. Bar M. 2004. Visual objects in context. Nat. Rev. Neurosci. 5, 617–629. (doi:10.1038/nrn1476)
