Crowding, grouping, and object recognition: A matter of appearance

Michael H Herzog, Bilge Sayim, Vitaly Chicherov, Mauro Manassi

PMID: 26024452
PMCID: PMC4429926
DOI: 10.1167/15.6.5

J Vis. 2015;15(6):5.

Abstract

In crowding, the perception of a target strongly deteriorates when neighboring elements are presented. Crowding is usually assumed to have the following characteristics. (a) Crowding is determined only by nearby elements within a restricted region around the target (Bouma's law). (b) Increasing the number of flankers can only deteriorate performance. (c) Target-flanker interference is feature-specific. These characteristics are usually explained by pooling models, which are very much in the spirit of classic models of object recognition. In this review, we summarize recent findings showing that crowding is not determined by the above characteristics, thus challenging most models of crowding. We propose that the spatial configuration across the entire visual field determines crowding. Only when one understands how all elements of a visual scene group with each other can one determine crowding strength. We put forward the hypothesis that appearance (i.e., how stimuli look) is a good predictor of crowding, because both crowding and appearance reflect the output of recurrent processing rather than interactions during the initial phase of visual processing.
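Characteristic (a) can be made concrete with a small sketch. It assumes the commonly quoted form of Bouma's law, in which the critical spacing is roughly half the target eccentricity, and it simplifies the layout to one dimension (target and flanker on the same meridian). The function names, the factor of 0.5 as a default, and the example eccentricities are illustrative assumptions, not values taken from this review.

```python
def bouma_critical_spacing(eccentricity_deg, bouma_factor=0.5):
    """Approximate radius (deg) of the crowding window around a target.

    bouma_factor=0.5 is the commonly quoted rule of thumb (an assumption here).
    """
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return bouma_factor * eccentricity_deg


def inside_bouma_window(target_ecc_deg, flanker_ecc_deg, bouma_factor=0.5):
    """True if a flanker on the same meridian lies within the window."""
    spacing = abs(flanker_ecc_deg - target_ecc_deg)
    return spacing < bouma_critical_spacing(target_ecc_deg, bouma_factor)


# Hypothetical layout: target at 10 deg, flankers at 12 deg and 18 deg.
print(inside_bouma_window(10.0, 12.0))  # True: spacing 2 deg < 5 deg window
print(inside_bouma_window(10.0, 18.0))  # False: spacing 8 deg > 5 deg window
```

Under this rule, only the first flanker should matter. The findings summarized in the review are those in which elements like the second flanker nevertheless change performance.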

Figures

**Figure 1**
(A) Basic pooling model. Elements (e.g., letters A, V, and E) activate input units that subsequently feed into a pooling unit. Because of the larger receptive field of the pooling unit, the features of the letters are jumbled. (B) Neurophysiology. Neurons in V1 are sensitive to simple features such as edges and lines. In higher visual areas, neurons are sensitive to more and more complex features, such as simple shapes in V4 and objects in IT. Receptive field sizes increase from lower visual areas to higher visual areas. (C) Hierarchical models of object recognition formalize the neurophysiological findings (see, e.g., Riesenhuber & Poggio, 1999). Stimulus processing starts with the analysis of very simple features (edges and lines) and proceeds to more and more complex visual representations (shapes). A hypothetical “square neuron” receives input from neurons tuned to angles, which in turn receive inputs from basic line detectors. Receptive field sizes increase as they integrate more and more information across the visual field. At each step in the hierarchy, only signals from the previous areas are combined. Responses in higher areas are fully determined by the input from lower areas. Information lost at early stages is irretrievably lost.
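The pooling idea in panel (A) can be illustrated with a toy sketch. It assumes the simplest possible pooling rule, a plain average of a single scalar feature over everything inside the pooling unit's receptive field. This is not any specific published model, and `pooled_response` together with all positions and feature values are hypothetical.

```python
def pooled_response(elements, center_deg, rf_radius_deg):
    """Average the feature values of all elements inside the receptive field.

    elements: list of (position_deg, feature_value) pairs.
    Returns None if no element falls inside the receptive field.
    """
    inside = [value for pos, value in elements
              if abs(pos - center_deg) <= rf_radius_deg]
    if not inside:
        return None
    return sum(inside) / len(inside)


# Hypothetical stimuli: the target carries signal 1.0, flankers carry 0.0.
target = (0.0, 1.0)
two_flankers = [(-1.0, 0.0), (1.0, 0.0)]
four_flankers = two_flankers + [(-1.5, 0.0), (1.5, 0.0)]
far_flanker = [(5.0, 0.0)]  # outside the 2-deg receptive field

alone = pooled_response([target], 0.0, 2.0)
crowded = pooled_response([target] + two_flankers, 0.0, 2.0)
more_crowded = pooled_response([target] + four_flankers, 0.0, 2.0)
unaffected = pooled_response([target] + far_flanker, 0.0, 2.0)

print(alone, crowded, more_crowded, unaffected)
```

In this toy version the target signal is diluted as flankers are added inside the receptive field, and elements outside it have no effect. That mirrors assumptions (a) and (b) of the abstract, which are the predictions the review argues against.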

**Figure 2**
(A) Observers were asked to discriminate whether a rectangle was wider along the horizontal (x) or vertical (y) axis. We determined the threshold width for which 75% correct responses were obtained. When the rectangle was flanked by three squares on each side, performance strongly deteriorated compared to when it was presented alone. This is a classic crowding effect. (B) Next, we asked observers to discriminate whether a vernier was offset to the left or right (a). We determined the offset size for which 75% correct responses occurred (left bar and dashed line). Performance deteriorated (i.e., thresholds increased) when the vernier was surrounded by a square (b). This is another classic crowding effect. Surprisingly, vernier discrimination improved when we combined the two conditions. Performance improved gradually as more squares were presented. Best performance occurred with 2 × 3 contextual squares. In this condition, the fixation dot is close to the leftmost square and the rightmost square is at 17.5° (i.e., well outside Bouma's window). (C) First, we repeated the basic conditions (a–c). Next, crowding was strong when we removed the horizontal lines of the flanking squares (d) or rotated the flanking squares by 90° (e). Data from (d) and (e) were collected in different experiments with different observers and are shown here together to ease presentation. In part (B), we adjusted square size individually to enhance effects. This explains the higher thresholds compared to (C). Modified from Manassi et al. (2013).
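The caption reports thresholds as the stimulus level that yields 75% correct responses. The following sketch shows one simple way such a threshold could be read off measured accuracies, by linear interpolation between tested levels. The caption does not state how the thresholds were actually estimated, so this is an assumption for illustration only; `threshold_at` and the data values are hypothetical.

```python
def threshold_at(levels, proportion_correct, criterion=0.75):
    """Interpolate the stimulus level at which accuracy reaches `criterion`.

    Assumes `levels` are sorted in ascending order and that accuracy
    rises with level. Returns None if the criterion is never crossed.
    """
    points = list(zip(levels, proportion_correct))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical vernier offsets (arcsec) and proportions correct.
offsets = [20, 40, 80, 160]
accuracy = [0.55, 0.70, 0.90, 0.98]

threshold = threshold_at(offsets, accuracy)
print(round(threshold, 3))
```

A higher interpolated threshold means that a larger offset is needed to reach 75% correct, which is how "thresholds increased" corresponds to deteriorated performance in the caption.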

**Figure 3**
Crowding and uncrowding depend on many grouping cues (for demonstrations, see Figure 5). (A) Good Gestalt. A vernier flanked by two lines of the same length yields high thresholds, that is, strong crowding (a). When the two lines are integrated in a rectangle, thresholds strongly decrease (b). Crossing the horizontal lines of the rectangle increases thresholds, similar to the single lines condition (c). Closing the rectangle by additional horizontal lines reduces crowding again (foveal vision: Sayim et al., ; peripheral vision: Manassi et al., 2012). The dashed line indicates performance for the unflanked vernier. (B) Pattern regularity. Thresholds for a red vernier flanked by single red lines (a) and 10 red lines (b) are high compared to the unflanked vernier condition (shown by the dashed line). When the flankers are green (c–d), thresholds are much lower. A grating with alternating red and green lines leads to high thresholds (e). The red (f) and green (g) parts of the alternating grating themselves crowd very little. Only when parts of the alternating grating are *combined* do they form a pattern that leads to strong crowding (adapted from Manassi et al., 2012). (C) Spacing regularity. Observers discriminated the orientation of a central letter T. Threshold elevation is high when the spacing between flanking letters is small and regular (*tight* condition). Increasing the spacing between the target and the innermost flankers decreased crowding (*shifted* condition). Crowding increased when we increased the spacing between the remaining flankers, creating a regular pattern (*wide* condition). Adding more flankers in the gaps between the flankers (*added* condition) decreased crowding again (modified from Saarela et al., 2010). (D) Contour integration. Gabor orientation discrimination is weak when the central Gabor is surrounded by radially arranged flankers. When the flankers make up a smooth contour, crowding is reduced (adapted from Livne & Sagi, 2007).

**Figure 4**
Electrophysiological correlates of crowding. (A) A vernier target was presented in the fovea and flanked by arrays of short, equal length, or long lines. (B) Accuracy was highest for long, intermediate for short, and worst for equal length flankers, in line with our grouping hypothesis. (C) Event-related potentials were recorded and global field power (GFP), which reflects overall brain activity, was computed. The time axis is referenced to stimulus onset. The early visual response (the P1 component) reflects flanker length. P1 amplitudes are highest for long flankers, intermediate for equal length, and lowest for short flankers. Crowding strength is (inversely) reflected in the N1 component around 180–200 ms, which is highest for long, intermediate for short, and lowest for equal length flankers. Hence, it seems that it takes about 50–80 ms to transform the initial encoding into an object-based perceptual code. (D) Source localization in the N1 time window. The color scale reflects activation differences in the brain associated with crowding (difference between the long flanker and equal length flanker conditions). In particular, sources in the lateral occipital and posterior temporal and parietal areas reflect crowding strength. Sources in V1 do not contribute significantly. Modified from Chicherov et al. (2014).
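Global field power is commonly defined as the spatial standard deviation of the potentials across all electrodes at each time point. The sketch below follows that common definition; whether the original analysis used exactly this computation is not stated in the caption. The function name and the tiny data set (three time points, four electrodes) are hypothetical.

```python
from statistics import pstdev


def global_field_power(erp):
    """Compute GFP for each time point.

    erp: list of time points, each a list of potentials (one per electrode).
    GFP is taken here as the population standard deviation across electrodes,
    which equals the RMS of the average-referenced potentials.
    """
    return [pstdev(sample) for sample in erp]


# Hypothetical ERP: 3 time points x 4 electrodes (microvolts).
erp = [
    [1.0, 1.0, 1.0, 1.0],    # flat topography: no field strength
    [2.0, -2.0, 2.0, -2.0],  # moderate field strength
    [0.0, 4.0, 0.0, -4.0],   # strongest field
]

gfp = global_field_power(erp)
peak_index = max(range(len(gfp)), key=gfp.__getitem__)
print(gfp, peak_index)
```

Because GFP collapses the whole scalp topography into one value per time point, component amplitudes such as P1 and N1 can be compared across flanker conditions without choosing individual electrodes.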

**Figure 5**
For illustrative purposes, we have plotted various stimuli for the studies. Fixate the central cross and compare stimuli on the right to those on the left hand side.

Cited by

Broad attention uncovers benefits of stimulus uniformity in visual crowding.
Rummens K, Sayim B. Sci Rep. 2021 Dec 14;11(1):23976. doi: 10.1038/s41598-021-03258-z. PMID: 34907221. Free PMC article.
Contextual Interactions in Grating Plaid Configurations Are Explained by Natural Image Statistics and Neural Modeling.
Ernst UA, Schiffer A, Persike M, Meinhardt G. Front Syst Neurosci. 2016 Oct 4;10:78. doi: 10.3389/fnsys.2016.00078. eCollection 2016. PMID: 27757076. Free PMC article.
The generality of the critical spacing for crowded optotypes: From Bouma to the 21st century.
Coates DR, Ludowici CJH, Chung STL. J Vis. 2021 Oct 5;21(11):18. doi: 10.1167/jov.21.11.18. PMID: 34694326. Free PMC article.
Offline transcranial direct current stimulation improves the ability to perceive crowded targets.
Chen G, Zhu Z, He Q, Fang F. J Vis. 2021 Feb 3;21(2):1. doi: 10.1167/jov.21.2.1. PMID: 33533878. Free PMC article. Clinical Trial.
Empirical Evidence for Intraspecific Multiple Realization?
Strappini F, Martelli M, Cozzo C, di Pace E. Front Psychol. 2020 Jul 24;11:1676. doi: 10.3389/fpsyg.2020.01676. eCollection 2020. PMID: 32793053. Free PMC article.
