Cogn Psychol. 2009 Mar;58(2):137-76. doi: 10.1016/j.cogpsych.2008.06.001. Epub 2008 Aug 30.

Recognition of natural scenes from global properties: seeing the forest without representing the trees

Michelle R Greene et al. Cogn Psychol. 2009 Mar.

Abstract

Human observers are able to rapidly and accurately categorize natural scenes, but the representation mediating this feat is still unknown. Here we propose a framework of rapid scene categorization that does not segment a scene into objects and instead uses a vocabulary of global, ecological properties that describe spatial and functional aspects of scene space (such as navigability or mean depth). In Experiment 1, we obtained ground truth rankings on global properties for use in Experiments 2-4. To what extent do human observers use global property information when rapidly categorizing natural scenes? In Experiment 2, we found that global property resemblance was a strong predictor of both false alarm rates and reaction times in a rapid scene categorization experiment. To what extent is global property information alone a sufficient predictor of rapid natural scene categorization? In Experiment 3, we found that the performance of a classifier representing only these properties is indistinguishable from human performance in a rapid scene categorization task in terms of both accuracy and false alarms. To what extent is this high predictability unique to a global property representation? In Experiment 4, we compared two models that represent scene object information to human categorization performance and found that these models had lower fidelity at representing the patterns of performance than the global property model. These results provide support for the hypothesis that rapid categorization of natural scenes may not be mediated primarily through objects and parts, but also through global properties of structure and affordance.
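
To make the Experiment 3 model concrete, the sketch below shows a naïve Bayes classifier trained to predict a scene's category from its seven global property rankings alone, with no object information. It is a minimal illustration rather than the authors' code: the data are synthetic placeholders and the category labels are assumed for the example.

    # Minimal sketch of a global-property classifier in the spirit of Experiment 3.
    # Features are the seven global property rankings; no objects or parts are used.
    # Data and category labels below are synthetic placeholders, not the study's stimuli.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    PROPERTIES = ["concealment", "transience", "navigability", "temperature",
                  "openness", "expansion", "mean_depth"]
    CATEGORIES = ["desert", "field", "forest", "lake", "mountain", "ocean"]  # illustrative labels

    rng = np.random.default_rng(0)
    n_per_category = 25  # placeholder sample size

    # X: one row per image, one column per global property ranking (1-8, as in Experiment 1).
    X = rng.integers(1, 9, size=(n_per_category * len(CATEGORIES), len(PROPERTIES))).astype(float)
    y = np.repeat(CATEGORIES, n_per_category)

    clf = GaussianNB()
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"mean cross-validated accuracy: {scores.mean():.2f}")  # ~chance on random data
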


Figures

Figure 1
A schematic illustration of the hierarchical grouping task of Experiment 1. Here, a ranking along the global property temperature is portrayed. (a) The images are divided into two groups with the “colder” scenes on the left and the “warmer” scenes on the right. (b) Finer rankings are created by dividing the two initial groups into two subgroups. (c) Images in each quadrant are again divided into two subgroups to create a total of eight groups, ranked from the “coldest” scenes to the “hottest” scenes.
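
One way to read the three-stage grouping is as a binary code: each successive split contributes one bit, so three splits place an image into one of eight ranked groups. The encoding below is an assumed illustration of that bookkeeping, not the authors' procedure.

    # Sketch: deriving an eight-level rank from the three successive binary splits
    # shown in Figure 1 (a-c). The 0/1 encoding is assumed for illustration.
    def rank_from_splits(split1: int, split2: int, split3: int) -> int:
        """Each split is 0 ('colder' side) or 1 ('warmer' side); returns a rank from 1 to 8."""
        return 4 * split1 + 2 * split2 + split3 + 1

    # An image placed on the warmer side twice and then in the cooler sub-subgroup:
    print(rank_from_splits(1, 1, 0))  # -> 7
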
Figure 2
Examples of scenes with low, medium and high rankings from Experiment 1 along each global property.
Figure 3
Box-and-whisker plots of global property rankings for each semantic category, calculated from the ranking data in Experiment 1. Properties are, right to left: C, concealment; Tr, transience; N, navigability; Te, temperature; O, openness; E, expansion; and Md, mean depth. Lines indicate median rankings, boxes indicate quartiles and whiskers indicate range. Significant outlier images are shown as crosses.
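
A panel in the style of Figure 3 can be produced with standard box-plot machinery; the sketch below uses matplotlib on synthetic rankings for a few assumed category labels, purely to show the format (median line, quartile box, whiskers, outlier markers).

    # Sketch of a Figure 3 style panel: box-and-whisker plots of one global
    # property's rankings, grouped by semantic category. Synthetic data only.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    categories = ["forest", "field", "desert", "lake"]  # illustrative subset of labels
    rankings = [rng.integers(1, 9, size=25) for _ in categories]  # synthetic 1-8 rankings

    fig, ax = plt.subplots()
    ax.boxplot(rankings)            # medians, quartile boxes, whiskers, outlier markers
    ax.set_xticklabels(categories)
    ax.set_xlabel("Semantic category")
    ax.set_ylabel("Openness ranking (1-8)")
    plt.show()
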
Figure 4
Illustration of human performance across different distractor sets in Experiment 2. Distractor sets that share a global property with the target category (closed is a property of forests and open is a property of fields) yield more false alarms than distractor sets that do not. Representative numbers taken from meta-observers' data.
Figure 5
Categorization performance (percent correct) of the naïve Bayes classifier in Experiment 3 is well correlated with human rapid categorization performance from Experiment 2 (meta-observer data).
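
The agreement plotted in Figure 5 amounts to a correlation between per-category accuracies; the sketch below shows that comparison, with hypothetical numbers standing in for the published values.

    # Sketch of the Figure 5 comparison: correlate the classifier's per-category
    # accuracy with human rapid-categorization accuracy. Values are hypothetical.
    from scipy.stats import pearsonr

    human_pct_correct = [88, 76, 91, 82, 79, 85, 80, 74]  # placeholder per-category values
    model_pct_correct = [85, 74, 89, 80, 81, 83, 77, 72]  # placeholder per-category values

    r, p = pearsonr(human_pct_correct, model_pct_correct)
    print(f"r = {r:.2f}, p = {p:.3f}")
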
Figure 6
Examples of human and model performance. Row (A) (bold titles) shows the correct responses made by both humans (Experiment 2) and the global property classifier (Experiment 3) for the scene pictures above. The other rows (titles in quotes) show categorization errors made by both humans and the model (B), by the model only (C), and by the humans only (D), for the respective scene pictures.
Figure 7
(A) Classifier's performance in Experiment 3 when trained with incomplete data, using from 1 to 7 global properties. The classifier can perform above chance with only one global property (30%), and performance linearly increases with additional properties. Chance level is indicated with the dotted line. (B) Mean classifier performance when trained with incomplete data that contained a particular global property. Classifier performed similarly when any particular global property was present.
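
The incomplete-data analysis in panel A amounts to retraining the classifier on every subset of the seven properties and averaging accuracy by subset size. The sketch below shows that loop, reusing the synthetic X, y and GaussianNB setup from the earlier sketch; it is an assumed reconstruction, not the authors' code.

    # Sketch of the Figure 7A analysis: train on every subset of 1-7 global
    # properties and average cross-validated accuracy by subset size.
    from itertools import combinations
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    def accuracy_by_subset_size(X, y, n_properties=7):
        means = {}
        for k in range(1, n_properties + 1):
            subset_scores = []
            for cols in combinations(range(n_properties), k):
                cv = cross_val_score(GaussianNB(), X[:, list(cols)], y, cv=5)
                subset_scores.append(cv.mean())
            means[k] = float(np.mean(subset_scores))
        return means  # {1: mean accuracy with one property, ..., 7: all properties}

    # Usage with the synthetic X, y from the earlier sketch:
    # print(accuracy_by_subset_size(X, y))
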
Figure 8
Examples of non-prototypical images. Human observers ranked the images according to their prototypicality along one or more categories (Appendix A.3). For all examples (H) indicates the order of prototypicality given by the human observers and (C) is the order of classification given by the global property classifier. Although the classifier rates the probability of the image being in each category, we show only the top choices for the same number of categories ranked by the human observers. In other words, if the human observers gave prototypicality rankings for two categories, we show the top two choices of the classifier.
Figure 9
Examples of segmentations and annotations made using the LabelMe annotation tool, and used as the basis for the local scene representation in Experiment 4.
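
A "local" representation of the kind used in Experiment 4 can be approximated as a bag of labeled objects per image; the sketch below builds such count vectors from LabelMe-style object name lists. The label lists and the vectorization choice are assumptions for illustration, not the paper's pipeline.

    # Sketch: turning LabelMe-style object annotations into bag-of-objects
    # feature vectors, one row per image. Labels below are made up for illustration.
    from collections import Counter
    from sklearn.feature_extraction import DictVectorizer

    annotations = {
        "img_001": ["tree", "tree", "rock", "path"],
        "img_002": ["grass", "sky", "fence"],
    }

    counts = [dict(Counter(labels)) for labels in annotations.values()]
    vec = DictVectorizer(sparse=False)
    X_local = vec.fit_transform(counts)  # rows: images; columns: object label counts
    print(vec.get_feature_names_out())
    print(X_local)
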
Figure 10
Examples of false alarms made by the global property classifier of Experiment 3 and the local semantic concept classifier of Experiment 4. Underneath, we report the percent of human false alarms made on that image. The global property classifier captures the majority of false alarms made by human observers, while the local semantic concept classifier captures fewer (see Table 5).
