Modeling visual clutter perception using proto-object segmentation

Chen-Ping Yu et al. J Vis. 2014 Jun 5;14(7):4. doi: 10.1167/14.7.4.

Abstract

We introduce the proto-object model of visual clutter perception. This unsupervised model segments an image into superpixels, then merges neighboring superpixels that share a common color cluster to obtain proto-objects, defined here as spatially extended regions of coherent features. Clutter is estimated by simply counting the number of proto-objects. We tested this model using 90 images of realistic scenes that were ranked by observers from least to most cluttered. Comparing this behaviorally obtained ranking to a ranking based on the model clutter estimates, we found a significant correlation between the two (Spearman's ρ = 0.814, p < 0.001). We also found that the proto-object model was highly robust to changes in its parameters and was generalizable to unseen images. We compared the proto-object model to six other models of clutter perception and demonstrated that it outperformed each, in some cases dramatically. Importantly, we also showed that the proto-object model was a better predictor of clutter perception than an actual count of the number of objects in the scenes, suggesting that the set size of a scene may be better described by proto-objects than objects. We conclude that the success of the proto-object model is due in part to its use of an intermediate level of visual representation, one between features and objects, and that this is evidence for the potential importance of a proto-object representation in many common visual percepts and tasks.
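The pipeline described in the abstract can be sketched in simplified form. This is a hypothetical re-implementation for illustration, not the authors' code: SLIC superpixels are approximated here by a fixed grid, and the paper's mean-shift color clustering is replaced by coarse quantization of each superpixel's median RGB color. Only the merge-and-count logic mirrors the model's core idea.

```python
# Simplified sketch of the proto-object clutter pipeline (hypothetical
# re-implementation). Assumptions: superpixels ~ fixed grid cells;
# mean-shift color clustering ~ coarse quantization of median colors.
import numpy as np

def grid_superpixels(h, w, n_side):
    """Label each pixel with a grid-cell index (stand-in for SLIC)."""
    rows = np.minimum(np.arange(h) * n_side // h, n_side - 1)
    cols = np.minimum(np.arange(w) * n_side // w, n_side - 1)
    return rows[:, None] * n_side + cols[None, :]

def median_color_clusters(image, labels, n_bins=8):
    """Assign each superpixel a color cluster by quantizing its median color."""
    clusters = {}
    for sp in np.unique(labels):
        med = np.median(image[labels == sp], axis=0)          # median RGB
        clusters[sp] = tuple((med * n_bins / 256).astype(int))  # coarse bin
    return clusters

def count_proto_objects(image, n_side=8, n_bins=8):
    """Merge neighboring superpixels sharing a color cluster; count regions."""
    h, w = image.shape[:2]
    labels = grid_superpixels(h, w, n_side)
    clusters = median_color_clusters(image, labels, n_bins)
    parent = {sp: sp for sp in np.unique(labels)}
    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # Union 4-connected neighboring superpixels with identical color clusters.
    for a, b in zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()):
        if a != b and clusters[a] == clusters[b]:
            parent[find(a)] = find(b)
    for a, b in zip(labels[:-1, :].ravel(), labels[1:, :].ravel()):
        if a != b and clusters[a] == clusters[b]:
            parent[find(a)] = find(b)
    return len({find(sp) for sp in parent})
```

A uniform image collapses to a single proto-object, while an image split into regions of distinct color yields one proto-object per region; the model's clutter estimate grows with this count.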

Keywords: color clustering; image segmentation; midlevel visual representation; proto-objects; superpixel merging; visual clutter.


Figures

Figure 1
What is the set size of these scenes? Although quantifying the number of objects in realistic scenes may be an ill-posed problem, can you make relative clutter judgments between these scenes?
Figure 2
Left: one of the images used in this study. Right, top row: a SLIC superpixel segmentation using 200 (left) and 1,000 (right) seeds. Right, bottom row: an entropy rate superpixel segmentation using 200 (left) and 1,000 (right) seeds. Notice that the superpixels generated by SLIC are more compact and regular, whereas those generated by the entropy rate method have greater boundary adherence but are less regular.
Figure 3
The computational procedure illustrated for a representative scene. Top row (left to right): a SLIC superpixel segmentation using k = 600 seeds; 51 clusters of median superpixel color using mean-shift (bandwidth = 4) in HSV color space; 209 proto-objects obtained after merging, normalized visual clutter score = 0.345; a visualization of the proto-object segmentation showing each proto-object filled with the median color from the corresponding pixels in the original image. Bottom row (left to right): an entropy rate superpixel segmentation using k = 600 seeds; 47 clusters of median superpixel color using mean-shift (bandwidth = 4) in HSV color space; 281 proto-objects obtained after merging, normalized visual clutter score = 0.468; a visualization of the proto-object segmentation showing each proto-object filled with the median color from the corresponding pixels in the original image.
Figure 4
Object segmentations from human observers for 4 of the 90 scenes used in this study. Segmentations were provided as part of the SUN09 image collection. To the right of each are lists of the object segment labels (object counts), in matching colors. Top left: three objects. Top right: 17 objects. Bottom left: 48 objects. Bottom right: 57 objects.
Figure 5
Representative examples of 3 of the 90 images used in this study (left column), shown with their corresponding proto-object segmentations (middle column) and reconstructions created by filling each proto-object with its median color (right column). Top row: Clutter score = 0.430 (ranked 41st). Middle row: Clutter score = 0.540 (ranked 65th). Bottom row: Clutter score = 0.692 (ranked 87th). Corresponding rankings from the behavioral participants were: 38th, 73rd, and 89th, respectively. Proto-object model simulations were based on entropy rate superpixel segmentations (Liu et al., 2011) using 600 initial seeds and a mean-shift clustering bandwidth of four within an HSV color feature space.
Figure 6
Clutter ranking of the 90 test scenes by the proto-object model plotted as a function of the median clutter ranking by our 15 behavioral participants for the same 90 scenes. Spearman's ρ = 0.814.
Figure 7
Spearman's correlations between the behaviorally obtained clutter ratings and ratings obtained from eight methods of predicting clutter, shown ordered from highest (left) to lowest (right). PO: our proto-object clutter model. MS: mean-shift image segmentation (Comaniciu & Meer, 2002). GB: graph-based image segmentation (Felzenszwalb & Huttenlocher, 2004). PL: power law clutter model (Bravo & Farid, 2008). ED: edge density (Mack & Oliva, 2004). FC: feature congestion clutter model (Rosenholtz et al., 2007). OC: object counts provided by the SUN09 image collection (Xiao et al., 2010). C3: color clustering clutter model (Lohrenz et al., 2009). All ps < 0.001 except for C3, for which p < 0.05.

References

    1. Achanta R., Shaji A., Smith K., Lucchi A., Fua P., Susstrunk S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34 (11), 2274–2282.
    2. Arbelaez P., Maire M., Fowlkes C., Malik J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33 (5), 898–916.
    3. Beck M. R., Lohrenz M. C., Trafton J. G. (2010). Measuring search efficiency in complex visual search tasks: Global and local clutter. Journal of Experimental Psychology: Applied, 16 (3), 238.
    4. Bergen J. R., Landy M. S. (1991). Computational modeling of visual texture segregation. In Landy M. S., Movshon J. A. (Eds.), Computational models of visual processing (pp. 253–271). Cambridge, MA: MIT Press.
    5. Bravo M. J., Farid H. (2008). A scale invariant measure of clutter. Journal of Vision, 8 (1): 23, 1–9, http://www.journalofvision.org/content/8/1/23, doi:10.1167/8.1.23.
