Nat Rev Psychol. 2024 Jan;3:13-26.
doi: 10.1038/s44159-023-00254-0. Epub 2023 Nov 23.

Predictive processing of scenes and objects

Marius V Peelen et al. Nat Rev Psychol. 2024 Jan.

Abstract

Real-world visual input consists of rich scenes that are meaningfully composed of multiple objects which interact in complex, but predictable, ways. Despite this complexity, we recognize scenes, and objects within these scenes, from a brief glance at an image. In this review, we synthesize recent behavioral and neural findings that elucidate the mechanisms underlying this impressive ability. First, we review evidence that visual object and scene processing is partly implemented in parallel, allowing for a rapid initial gist of both objects and scenes concurrently. Next, we discuss recent evidence for bidirectional interactions between object and scene processing, with scene information modulating the visual processing of objects, and object information modulating the visual processing of scenes. Finally, we review evidence that objects also combine with each other to form object constellations, modulating the processing of individual objects within the object pathway. Altogether, these findings can be understood by conceptualizing object and scene perception as the outcome of a joint probabilistic inference, in which "best guesses" about objects act as priors for scene perception and vice versa, in order to concurrently optimize visual inference of objects and scenes.
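The joint probabilistic inference described above can be illustrated with a toy calculation. The following is a minimal sketch, not the authors' model: the two-object, two-scene world, the co-occurrence prior, and all likelihood values are hypothetical numbers chosen for illustration, and the function names are invented here. It shows how a scene can disambiguate a blurry object, how a sharp object resists an incongruent scene, and how an object can disambiguate a blurry scene.

```python
# Toy joint inference over one object and one scene.
# All probabilities below are illustrative assumptions, not values from the review.
OBJECTS = ["car", "printer"]
SCENES = ["road", "office"]

# Hypothetical co-occurrence prior P(object, scene):
# cars mostly appear on roads, printers mostly in offices.
JOINT_PRIOR = {
    ("car", "road"): 0.45,
    ("car", "office"): 0.05,
    ("printer", "road"): 0.05,
    ("printer", "office"): 0.45,
}


def joint_posterior(object_lik, scene_lik):
    """Return P(object, scene | image) from per-pathway likelihoods."""
    unnormalized = {
        (o, s): JOINT_PRIOR[(o, s)] * object_lik[o] * scene_lik[s]
        for o in OBJECTS
        for s in SCENES
    }
    total = sum(unnormalized.values())
    return {pair: value / total for pair, value in unnormalized.items()}


def object_marginal(posterior):
    """Sum the joint posterior over scenes to get P(object | image)."""
    return {o: sum(posterior[(o, s)] for s in SCENES) for o in OBJECTS}


def scene_marginal(posterior):
    """Sum the joint posterior over objects to get P(scene | image)."""
    return {s: sum(posterior[(o, s)] for o in OBJECTS) for s in SCENES}


clear_road = {"road": 0.9, "office": 0.1}
blurry_scene = {"road": 0.5, "office": 0.5}
blurry_object = {"car": 0.5, "printer": 0.5}
sharp_printer = {"car": 0.02, "printer": 0.98}
sharp_car = {"car": 0.98, "printer": 0.02}

# Blurry object on a clear road: the scene tips the object estimate toward "car".
blurry_on_road = object_marginal(joint_posterior(blurry_object, clear_road))

# Sharp printer on a clear road: local evidence outweighs the incongruent scene.
printer_on_road = object_marginal(joint_posterior(sharp_printer, clear_road))

# Sharp car in a blurry scene: the object tips the scene estimate toward "road".
scene_from_car = scene_marginal(joint_posterior(sharp_car, blurry_scene))

print(blurry_on_road, printer_on_road, scene_from_car)
```

With these assumed numbers, the blurry object is read as a car with probability of about 0.82, while the sharp printer remains a printer despite the road context. The same prior works in both directions, which is the sense in which object estimates can act as priors for scenes and vice versa.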

Figures

Fig. 1. Scene- and object-selective regions in human visual cortex and their relation to the center-periphery organization.
a) Medial (left) and lateral (right) views of scene- and object-selective regions in the human visual cortex. Scene-selective regions include MPA (medial place area), OPA (occipital place area), and PPA (parahippocampal place area). Object-selective regions include pFs (posterior fusiform gyrus) and LO (lateral occipital cortex). EVC: early visual cortex (adapted from). b) Ventral view of scene- and object-selective regions and their relation to the center-periphery organization. The scene-selective PPA is biased towards peripheral visual input, while the object-selective pFs and LO are biased towards central visual input (adapted from).
Fig. 2. Bidirectional interactions between objects and scenes.
a) Scene context can shape object perception, particularly when an object is ambiguous (“object blurry”, top row). In this example, the ambiguous object is perceived either as a car or as a printer depending on the scene context. However, when the object is sharp (bottom row), it can be recognized based on local features alone, reducing the influence of context. In that scenario, an incongruent object (a printer on a road) is surprising and receives more attention. b) Objects can also shape scene perception, following similar principles. Here, the ambiguous scene (top row) is perceived as an outdoor (open) or indoor (closed) space depending on the objects. When the scene is sharp, the object influence is reduced (bottom row).
Fig. 3. Scene context can both facilitate and impair object perception.
a) Paradigm to investigate the perceived sharpness of objects in scenes. Participants adjusted the blur level of a sample object (the car) to match that of a target object. b) More blur was added to objects when they were viewed within a coherent scene context, indicating that those objects were initially perceived as sharper. Congruent scene context thus facilitates object perception. c) Paradigm used to investigate the perception of unambiguous objects as a function of semantic congruency. After viewing a scene for 2.5 s, participants had to indicate which of two exemplars had been presented in the scene. Key objects could be congruent or incongruent with the scene (top row). To control for general effects of congruence, control objects (Other) were also tested (bottom row). Other objects were always congruent with the scene but were presented in scenes that also contained a congruent or an incongruent Key object. d) Results showed a congruency cost, such that participants were more accurate at recognizing objects that were incongruent with the scene. No effect of congruency was found for the control objects. In this case, congruent scene context impaired object perception. Error bars show 95% confidence intervals.
Fig. 4. Neural evidence for bidirectional interactions between object and scene processing in the visual cortex.
a) To test whether scene context modulates representations in the visual cortex, participants viewed ambiguous objects with (“Scene-disambiguated object”) and without (“Ambiguous object”) scene context while brain activity was measured using fMRI. Multivariate activity patterns evoked in the object-selective visual cortex by the ambiguous objects were classified as animate or inanimate by a classifier trained on activity patterns evoked by clearly visible objects (illustrated by the picture in the inset), presented in a separate experimental run. Results showed that the presence of scene context increased decoding accuracy (i.e., third bar higher than second bar). These results may reflect a neural correlate of the perceptual sharpening illustrated in Fig. 3a,b. b) Similar effects were observed for the reverse influence, with objects modulating scene representations in the scene-selective cortex. In this case, an object disambiguated the scene, such that response patterns evoked by ambiguous scenes became more similar to those evoked by clearly visible scenes presented in a separate experimental run. Error bars indicate standard error of the mean.
Fig. 5. Object constellations.
a) Objects are often seen together with other objects in familiar spatial arrangements, as illustrated here for a living room set. Neuroimaging studies have provided evidence for integrative representations of regular object arrangements in the ventral visual cortex. b) Example stimuli used to test whether regular object arrangements are detected more quickly than irregular object arrangements in a breaking continuous flash suppression experiment. Inverting the objects serves as a control for possible low-level stimulus differences. c) Results showing that regular object displays broke suppression more quickly (i.e., were detected faster) than irregular displays. No such effect was found for inverted controls. Error bars show 95% confidence intervals of the mean difference between regular and irregular conditions.
Fig. 6. An integrated model of object-scene and object-object interactions.
a) General overview of visual processing, from initial perceptual processing (bottom) to semantic-level representations of scene schemas (top). Object processing is primarily informed by foveal (local) input, while scene processing is primarily informed by peripheral (global) input. Each processing pathway is hierarchically organized, with feedforward and feedback connections, indicated by red and blue arrows, respectively. Both pathways project to, and receive information from, higher-order regions containing semantic information of object-scene schemas (mental models in long-term memory). b) Schematic illustration of proposed interactions within and between object and scene processing pathways. Hierarchical organization of each pathway is indicated with circles, which contain low-, mid-, and high-level features of object and scene processing. Cross-pathway interactions may be most effective at higher levels of the hierarchy, but may also exist at lower levels (not illustrated). Subsequent feedback within each pathway can result in modulations at lower levels of the processing hierarchy and result in perceptual sharpening.
