Review
J Vis. 2011 Dec 1;11(5):13.
doi: 10.1167/11.5.13.

Peripheral vision and pattern recognition: a review

Hans Strasburger et al. J Vis. 2011.

Erratum in

Abstract

We summarize the various strands of research on peripheral vision and relate them to theories of form perception. After a historical overview, we describe quantifications of the cortical magnification hypothesis, including an extension of Schwartz's cortical mapping function. The merits of this concept are considered across a wide range of psychophysical tasks, followed by a discussion of its limitations and the need for non-spatial scaling. We also review the eccentricity dependence of other low-level functions including reaction time, temporal resolution, and spatial summation, as well as perimetric methods. A central topic is then the recognition of characters in peripheral vision, both at low and high levels of contrast, and the impact of surrounding contours known as crowding. We demonstrate how Bouma's law, specifying the critical distance for the onset of crowding, can be stated in terms of the retinocortical mapping. The recognition of more complex stimuli, like textures, faces, and scenes, reveals a substantial impact of mid-level vision and cognitive factors. We further consider eccentricity-dependent limitations of learning, both at the level of perceptual learning and pattern category learning. Generic limitations of extrafoveal vision are observed for the latter in categorization tasks involving multiple stimulus classes. Finally, models of peripheral form vision are discussed. We report that peripheral vision is limited with regard to pattern categorization by a distinctly lower representational complexity and processing speed. Taken together, the limitations of cognitive processing in peripheral vision appear to be as significant as those imposed on low-level functions and by way of crowding.

Figures

Figure 1.
One of Lettvin's demonstrations. “Finally, there are two images that carry an amusing lesson. The first is illustrated by the O composed of small o's as below. It is a quite clearly circular array, not as vivid as the continuous O, but certainly definite. Compare this with the same large O surrounded by only two letters to make the word HOE. I note that the small o's are completely visible still, but that the large O cannot be told at all well. It simply looks like an aggregate of small o's.” (Lettvin, , p. 14) with the permission of the New York Academy of Sciences.
Figure 2.
(a) The perimeter built by Hermann Aubert and Carl Foerster in Breslau in 1855 to measure letter acuity in dark adaptation. “We had digits and letters printed on 2 ft wide and 5 ft long paper at equal distances. That paper sheet could be scrolled by two cylinders, such that new characters could always be brought into the visual field. The frame was adjustable between 0.1 and 1 m viewing distance …” (Aubert & Foerster, 1857). The use of an electric arc (“Riesssche Flasche”) for brief presentation dates back to Volkmann and Ernst Heinrich Weber. (b) Aubert and Foerster's (1857) results for photopic two-point resolution (measured with a different apparatus). The inner circle corresponds to 9° visual angle; measurements go out to 22°. Note the linear increase up to 14.5° radius, and steeper increase further out.
Figure 3.
Square-wave grating acuity results by Theodor Wertheim (1894) in Berlin. The markings on the lines of constant acuity (isopters) are, from the inside outwards: 1; 0.333; 0.2; 0.143; 0.1; 0.074; 0.056; 0.045; 0.04; 0.033; 0.026. These were relative readings where central acuity is set equal to 1. Stimuli were constructed from wire frames.
Figure 4.
Cone and rod receptor density results by Østerberg (1935). These data underlie many of the current textbook figures.
Figure 5.
MAR functions reviewed by Weymouth (1958). “Comparison of vernier threshold, minimal angle of resolution, motion threshold, and mean variation of the settings of horopter rods” (1958, Figure 13). With permission from Elsevier.
Figure 6.
(a) Retinotopic organization of area V1 by Daniel and Whitteridge (1961). Vertical lines show eccentricity boundaries, horizontal curved lines show radians as in the visual half-field in (b). “This surface is folded along the heavy dotted lines so that F touches E, that D and C touch B, and A folds round so that it touches and overlaps the deep surface of B.” (1961, p. 213). With permission from Wiley.
Figure 7.
Demonstration of peripheral letter acuity by Anstis (1974) (cut-out). Letter sizes are chosen such that they are at the size threshold (2 subjects, 216 cd/m²) during central fixation. Surprisingly, this is true almost regardless of viewing distance, as eccentricity angle and viewing angle vary proportionally with viewing distance. (To obtain the chart in original size, enlarge it such that the center of the lower “R” is 66 mm from the fixation point.) With permission from Elsevier.
Figure 8.
Characterization of the visual field by Pöppel and Harvey. (a) Perimetry data by Harvey and Pöppel (1972), i.e. light increment thresholds. Reproduced with permission from The American Academy of Optometry 1972. Note that on the temporal side the visual field extends further out than seen by the outer isopter, to around 107°; it is limited here by the test spot used. (b) Schematic representation of the visual field by Pöppel and Harvey (1973) based on the data in (a). They distinguish five regions: (A) the fovea which shows highest photopic sensitivity; (B) the perifovea with a radius of around 10° where photopic thresholds increase with eccentricity; (C) a performance plateau extending to around 20° vertically and 35° horizontally where the dashed circle shows the nasal border; (D) peripheral field where thresholds increase up to the border of binocular vision; (E) monocular temporal border region. The two black dots are the blind spots.
Figure 9.
Examples of M scaling functions. By definition, only size is considered in the scaling (modified from Strasburger, 2003b). For easy comparison these functions disregard the horizontal/vertical anisotropy. Curve (a): The function used by Rovamo and Virsu (1979), $M^{-1} = (1 + aE + bE^{3}) \cdot M_0^{-1}$, with the values $a = 0.33$; $b = 0.00007$; $M_0 = 7.99$ mm/° (for the nasal horizontal meridian). Curve (b) (dashed line): Power function with exponent 1.1 used by van Essen et al. (1984) for their anatomical results, $M^{-1} = (1 + aE)^{1.1} \cdot M_0^{-1}$, but with parameters $a$ and $M_0$ as in (a) for a comparison of the curves' shapes. Curve (c): Same function as in (b) but with values given by van Essen et al. (1984) for the macaque, $a = 1.282$ and $M_0 = 15.55$ mm/°. Curve (d): Same function as in (b) but with values estimated by Tolhurst and Ling (1988) for the human, with $M_0$ estimated to be 1.6-fold larger: $M_0 = 24.88$ mm/°. Curve (e) (green, dashed): Inverse linear function with values from Horton and Hoyt (1991): $E_2 = 0.75$ and $M_0 = 23.07$ mm/°. Curve (f) (red, long dashes): Inverse linear function with values from Schira, Wade, and Tyler (2007): $E_2 = 0.77$ and $M_0 = 24.9$ mm/° (root of areal factor). Curve (g) (blue, long dashes): Inverse linear function with own fit to Larsson and Heeger's (2006) area-V1 location data: $M_0 = 22.5$; $E_2 = 0.785$. Curve (h) (purple, dash-dotted): Inverse linear function with values from Duncan and Boynton (2003): $M_0 = 18.5$; $E_2 = 0.831$.
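The scaling functions listed in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the explicit form of the "inverse linear" function, M = M0 / (1 + E/E2), is an assumption based on the usual definition of E2 (the eccentricity at which M falls to half its foveal value); the caption itself gives only the parameter values.

```python
# Sketch of the cortical magnification functions from the Figure 9 caption.
# E is eccentricity in degrees; M is returned in mm of cortex per degree.
# Function names are illustrative and do not come from the paper.

def m_rovamo_virsu(ecc_deg, a=0.33, b=0.00007, m0=7.99):
    """Curve (a): M^-1 = (1 + a*E + b*E^3) * M0^-1 (Rovamo & Virsu, 1979)."""
    return m0 / (1.0 + a * ecc_deg + b * ecc_deg ** 3)

def m_power(ecc_deg, a=1.282, m0=15.55, exponent=1.1):
    """Curves (b)-(d): M^-1 = (1 + a*E)^1.1 * M0^-1.
    Defaults are the macaque values quoted for curve (c)."""
    return m0 / (1.0 + a * ecc_deg) ** exponent

def m_inverse_linear(ecc_deg, e2=0.75, m0=23.07):
    """Curves (e)-(h): ASSUMED form M = M0 / (1 + E/E2).
    Defaults are the Horton & Hoyt (1991) values quoted for curve (e)."""
    return m0 / (1.0 + ecc_deg / e2)

if __name__ == "__main__":
    # Tabulate the three function families at a few eccentricities.
    for ecc in (0.0, 1.0, 5.0, 10.0, 30.0):
        print(
            f"E = {ecc:5.1f} deg  "
            f"(a) {m_rovamo_virsu(ecc):6.3f}  "
            f"(c) {m_power(ecc):6.3f}  "
            f"(e) {m_inverse_linear(ecc):6.3f}  mm/deg"
        )
```

Under the assumed inverse linear form, evaluating `m_inverse_linear` at E = E2 returns exactly half the foveal value M0, which is the property the E2 parameter is meant to capture.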
Figure 10.
Estimation of ganglion cell density by Drasdo (1989). The continuous line shows the inverse of the linear ganglion cell density as a function of eccentricity. According to the model, the hatched area under the curve is equal to the area under the dashed-line (from Strasburger, , modified from Drasdo, , Figure 1). With permission from Elsevier.
Figure 11.
Schematic illustration of the E2 value. Four functions with same E2 are shown, two linear functions with different foveal values, and two non-linear functions with same foveal value (from Strasburger, , Chpt. 4).
Figure 12.
Spatial summation for the detection of a homogeneous spot of light in central and peripheral vision. (a) Schematic illustration of Riccò's and Piper's law of spatial summation. (b) Spatial summation in peripheral view for two observers (monocular, 15° nasal, dark adapted, 12.8 ms). Data by Graham and Bartlett (, Table 2). (c) Diameter of receptive and perceptive fields for the human, monkey, and cat. Open squares: Human perceptive fields, mean of temporal and nasal data provided by Oehler (, Figure 4). Open circles: Monkey perceptive fields, obtained by using the Westheimer paradigm (Oehler, , Figure 8). Filled circles: Monkey receptive field (De Monasterio & Gouras, , Figure 16, broad-band cells). Crosses and filled triangles: receptive fields of the cat (Fischer & May, 1970, Figure 2). Analyses by Strasburger (2003b), figures modified from Strasburger (2003).
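For readers unfamiliar with the two summation laws named in panel (a), their textbook statements are given below. The symbols are assumptions introduced here and do not appear in the caption: $I$ is the threshold intensity of the spot, $A$ its area, and $A_c$ the critical area up to which summation is complete.

```latex
% Textbook forms of the two spatial summation laws (symbols assumed, see above)
\begin{align}
  I \cdot A        &= \text{const}, \quad A \le A_c
    && \text{(Ricc\`o's law: complete summation)} \\
  I \cdot \sqrt{A} &= \text{const}, \quad A > A_c
    && \text{(Piper's law: partial summation)}
\end{align}
```

On a plot of log threshold against log area, these correspond to straight-line segments with slopes of $-1$ and $-1/2$, respectively.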
Figure 13.
(a) 3D representation of the contrast-size trade-off functions for one subject (WB) (from Strasburger, ; like Strasburger et al., , Figure 1, but interpolated in the blind spot). (b) Full set of contrast-size functions for the same subject (from Strasburger et al., 1994).
Figure 14.
Visual fields of recognition and detection for one subject (CH). Recognition fields (heavy lines) are obtained from threshold-contrast-vs.-size trade-off functions as shown in Figure 13. The form of the field is approximated by ellipses. Each ellipse shows the border of recognition at a given level of contrast, at the values 1.2%, 2%, 3%, 4%, 6%, 10%, 30% starting from the inner circle (contrast in Michelson units). Note the performance plateau on the horizontal meridian between 10° and 25° (between the 3% and 4% line), similar to the one found in perimetry (Harvey & Pöppel, ; Pöppel & Harvey, 1973). The 100%-contrast ellipse represents a maximum field of recognition obtained by extrapolation; its diameter is 46° × 32°. Also indicated in dashed lines are the fields of light-spot detection in standard static perimetry for the same subject. (From Strasburger & Rentschler, , Figure 4.) Note that the dashed line does not represent the full visual field of detection since a small test spot is used for the perimetric data; the full field would extend to around ±107°.
Figure 15.
Contrast thresholds for the recognition of characters (lower curves) compared to the detection of Gabor gratings (upper curves); (a) mean over all meridians, (b) horizontal meridian. Character height 2.4°, Gabor patches: 1 cpd, σ = 1.5°. From (Strasburger, ; Strasburger et al., 2001).
Figure 16.
Contrast thresholds for the recognition of characters (a) compared to the detection of Gabor gratings (b) in the full field up to 30°; same conditions as in Figure 15. Error bars show standard deviations. From Strasburger, .
Figure 17.
Prediction of the threshold contrast for character recognition in the central 30° radius visual field. (C: Michelson threshold contrast, E: eccentricity (°), S: size threshold, Pc: percent correct, c: supra-threshold contrast, ln: natural log, β: slope measure). Adapted from Strasburger, , ; Strasburger & Rentschler, . For the psychometric function and its slope measure see Strasburger (2001a, 2001b).
Figure 18.
(a) Example of a contrast-size trade-off function in the fovea. Plotted is log Weber contrast vs. log size, so as to allow comparison with Riccò's law. (b) Maximum slope in the contrast-size trade-off function as in the figure on the left, at a range of eccentricities on the horizontal meridian (modified from Strasburger, , Figures 5.4-13 and 5.4-14).
Figure 19.
Stimulus configurations in letter crowding studies. (a) Averbach and Coriell (1961); (b) Flom et al. (1963); (c) Eriksen and Rohrbaugh (1970); (d) Wolford and Chambers (1983); (e) Strasburger et al. (1991), with permission from Springer Science+Business Media; (f) Toet and Levi (1992), with permission from Elsevier; (g) Anstis' (1974) crowding demonstration chart, with permission from Elsevier. Bouma's (1970) stimuli are not shown; he used twenty-five lower case letters in Courier-10 font of 0.22° height. (Graphics modified from Strasburger, 2003b).
Figure 20.
Sample crowding interaction ranges (enlarged for better visibility by a factor of two) at three eccentricities for one subject, given by Toet and Levi (, Figure 6). Toet & Levi's stimulus configuration (for closest lateral distance) is shown in Figure 19f. With permission from Pion Ltd, London.
Figure 21.
Cue effects in low-contrast letter crowding vs. flanker distance (from Strasburger & Malania, 2011). (a) Cue gain-control effect on contrast thresholds; (b) positional errors; (c) “Doughnut model”: The transparent gray mask visualizes log-contrast gain control from transient attention taken from (a). On the left is the fixation point. Note the (bright) excitatory spotlight on the target and the (dark) inhibitory surround.
Figure 22.
Crowding in words and faces (modified from Martelli et al., , Experiment 2). (a). Illustration of critical distance. When fixating the square, the identification of a target feature (here: the central letter in the words (top), or the shape of the mouth in the face caricature (bottom)) is impaired by surrounding features (left) unless there is sufficient spatial separation (right). (b). Threshold contrast for target identification as a function of part spacing. For each eccentricity, the floor break point of the fitted lines defines the critical spacing. (c). Critical spacing as a function of eccentricity of target. The data show a linear increase of critical distance with eccentricity (average slope: 0.34). The gray diamonds refer to estimates based on the data of the face identification study by Mäkelä et al. (2001). (d). Critical distance as a function of size of target (eccentricity 12°). The data show that critical distance is virtually unaffected by size (average slope: 0.007).
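The linear relation reported in panel (c) can be illustrated with a short Python sketch. Only the average slope of 0.34 is taken from the caption; the function names, the zero intercept, and the simple "crowded if spacing is below critical spacing" rule are assumptions made for illustration and are not the authors' fitting procedure.

```python
# Sketch: critical spacing as a linear function of target eccentricity.
# The slope 0.34 is the average slope quoted in the Figure 22 caption.
# The zero intercept is an ASSUMPTION; the caption reports no intercept.

def critical_spacing(ecc_deg, slope=0.34, intercept_deg=0.0):
    """Return the critical spacing (deg) at a given eccentricity (deg)."""
    return intercept_deg + slope * ecc_deg

def is_crowded(spacing_deg, ecc_deg, slope=0.34, intercept_deg=0.0):
    """Hypothetical rule: parts closer than the critical spacing crowd the target."""
    return spacing_deg < critical_spacing(ecc_deg, slope, intercept_deg)

if __name__ == "__main__":
    for ecc in (4.0, 8.0, 12.0):
        print(f"eccentricity {ecc:4.1f} deg -> critical spacing "
              f"{critical_spacing(ecc):.2f} deg")
```

Target size does not enter the sketch at all, in line with panel (d), where critical distance is reported to be virtually unaffected by size.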
Figure 23.
Dissociation of category and discrimination learning (modified from Jüttner & Rentschler, 2000). (a). The learning signals were given by a set of fifteen compound Gabor gratings, defined in a two dimensional Fourier feature space. Within this feature space, the learning stimuli formed three clusters thus defining three classes. Two different sets of signals, A and B, were generated. They had the same configuration with respect to their cluster means (dashed triangle) and only differed in their mean class variance σm. For signal set B the latter was reduced by a factor of 100 relative to set A, as indicated by the circles. (b). Illustration of the actual graylevel representations of the patterns in set A. (c). Learning tasks. For category learning (top), the subjects were trained with all three classes (I–III) simultaneously. For discrimination learning (bottom) the subjects were trained only with pairs of pattern classes (i.e., I vs. II, II vs. III, and I vs. III) in three consecutive experiments. (d). Mean learning time as a function of eccentricity of training location. For set A (solid lines), observers show fast discrimination learning regardless of training location. By contrast, for categorization, learning duration is greatly increased in extrafoveal viewing conditions. For set B (dashed lines) the dissociation between the two tasks is still significant but markedly reduced.
Figure 24.
Imperfect translation-invariance for recognizing configural changes in sequential pattern matching (modified from Dill & Edelman, , Experiments 3 and 4). The two patterns to be matched were shown either at the same location (“control” condition), or at separate locations involving either horizontally, vertically or diagonally adjacent quadrants. (a). Examples of scrambled animal-like patterns. Stimuli within each column differ in their parts but share the same part configuration. Stimuli within each row consist of the same parts in different configurations. (b). Rate of correct responses as a function of spatial separation in the “same configuration – different part” condition. Solid line: “same” responses; dashed line: “different” responses. The data show a significant interaction of the two response types. However, the corresponding d′ values (red line) reveal no significant variation with separation. (c). As before but for correct responses in the “different configuration – same parts” condition. Again, the data show a significant interaction between “same” and “different” responses. Crucially, the corresponding d′ values display a significant effect of spatial separation. With permission from Pion Ltd, London.
Figure 25.
Disconnected and connected figural elements and point-wise samples thereof (from Minsky & Papert, 1971).
Figure 26.
Original images (left column) as seen with “complex cells-only” vision (right column). These simulations are obtained from a model of amblyopic vision and provide a first approximation of peripheral form vision (from Treutwein et al., 1996).
Figure 27.
Crowding as a result of summary statistics within a model of texture analysis and synthesis (from Balas et al., 2009).
Figure 28.
Internal representations of pattern categories acquired in direct (centre column) and indirect view (left and right column) by two subjects (AD and KR) in a three-class learning paradigm involving a set of 15 compound Gabor patterns. The corners of the dotted triangles represent the class means of the pattern categories within the generating evenness/oddness Fourier feature space. Internalized class prototypes (open and closed symbols) were obtained by fitting the PVP model to the psychophysical classification matrix cumulated across the learning sequence of each observer. Learning duration, as indicated by the number of learning units to criterion (numbers at the triangle tip), increases nearly ten-fold in indirect view (from Jüttner & Rentschler, 1996).
Figure 29.
Dynamics of category learning in indirect view. Internal representations of pattern classes as in Figure 28. Observer C.Z. took 13 learning units to criterion. PVP configurations are obtained from locally averaging classification matrices by means of a Gaussian kernel with fixed spread parameter. Step size Δk is one learning unit. Decimal notations in brackets indicate the learning unit number and the root of the mean squared error of fit (from Unzicker et al., 1999). With permission from Elsevier.
Figure A1.
(a) Five of the ten examples of perceptual shortening provided by Korte (, p. 67) showing meaningless syllables (sif, läunn, diecro, goruff, läff) and how they were reported by Korte's subjects (“sif” reported four times as “ff”, twice as “ss”, etc.). (b) Examples of false localization of detail with regard to whole letters (p. 42). (c) Examples of false localization of detail within letters (p. 41). Material in the three graphs is copied from the original text and arranged, since the font (Fraktur, lower case) is not available in modern font sets.
