Review
J Vis. 2022 Jan 4;22(1):1.
doi: 10.1167/jov.22.1.1.

The many facets of shape

James T Todd et al. J Vis.

Abstract

Shape is an interesting property of objects because it is used in ordinary discourse in ways that seem to have little connection to how it is typically defined in mathematics. The present article describes how the concept of shape can be grounded in Euclidean and non-Euclidean geometry and also in human perception. It considers the formal methods that have been proposed for measuring the differences among shapes and how the performance of those methods compares with the shape difference thresholds of human observers. It discusses how different types of shape change can be perceptually categorized. It also evaluates the specific data structures that have been used to represent shape in models of both human and machine vision, and it reviews the psychophysical evidence about the extent to which those models are consistent with human perception. Based on this review of the literature, we argue that shape is not one thing but rather a collection of many object attributes, some of which are more perceptually salient than others. Because the relative importance of these attributes can be context dependent, there is no obvious single definition of shape that is universally applicable in all situations.


Figures

Figure 1.
Three polygons that are geometrically similar to one another. Objects A and B are also congruent.
Figure 2.
Some invariants of affine and projective geometry. The left panel depicts a rectangular object that has been subjected to a shearing transformation. Note that the ratios of parallel line intervals are preserved. The right panel shows the central projection of a line with four points labeled A, B, C, and D. The cross-ratio of line intervals defined by those points is invariant over all projective transformations.
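The two invariants named in the Figure 2 caption can be checked numerically. The following is a minimal sketch, not taken from the article: it restricts attention to points on a single line, represents them by 1D coordinates, and uses hypothetical helper names (`cross_ratio`, `interval_ratio`, `projective_map`) and arbitrarily chosen sample values. An affine map of the line is treated as the special case of a projective map with `r = 0` and `s = 1`.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross-ratio (AC * BD) / (BC * AD) of four collinear points A, B, C, D."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def interval_ratio(a, b, c):
    """Ratio AB / BC of two intervals on the same line."""
    return (b - a) / (c - b)


def projective_map(x, p, q, r, s):
    """1D projective transformation x -> (p*x + q) / (r*x + s), with p*s - q*r != 0."""
    return (p * x + q) / (r * x + s)


# Arbitrary sample coordinates for A, B, C, D (exact rational arithmetic).
points = [Fraction(v) for v in (0, 1, 3, 7)]

# An affine map (r = 0, s = 1) and a genuinely projective map (r != 0).
affine_images = [projective_map(x, 2, 1, 0, 1) for x in points]
projective_images = [projective_map(x, 2, 1, 1, 5) for x in points]

# Affine maps preserve ratios of collinear intervals.
affine_keeps_ratio = interval_ratio(*points[:3]) == interval_ratio(*affine_images[:3])

# Projective maps generally do not, but they do preserve the cross-ratio.
projective_keeps_ratio = interval_ratio(*points[:3]) == interval_ratio(*projective_images[:3])
projective_keeps_cross_ratio = cross_ratio(*points) == cross_ratio(*projective_images)
```

With these sample values, the interval ratio survives the affine map but not the projective one, whereas the cross-ratio survives both. The caption's shearing example concerns parallel intervals in the plane; the 1D case here is only the simplest instance of the same property.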
Figure 3.
Front and side views of a normal human head (top) and one that was distorted by an affine shearing transformation (bottom). Note that the frontal views are indistinguishable from one another. This is an example of the bas-relief ambiguity first identified by Belhumeur, Kriegman, and Yuille (1997).
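The geometric part of the Figure 3 caption can be illustrated with a short sketch. It assumes orthographic projection, a viewing direction along the z axis for the frontal view and along the x axis for the side view, and a depth shear of the form z' = z + mu*x + nu*y; the function names and sample points are hypothetical, and the sketch ignores shading entirely.

```python
def shear_depth(point, mu, nu):
    """Affine shear in depth: x and y are unchanged, z becomes z + mu*x + nu*y."""
    x, y, z = point
    return (x, y, z + mu * x + nu * y)


def frontal_view(point):
    """Orthographic projection along the z axis (depth is discarded)."""
    x, y, _ = point
    return (x, y)


def side_view(point):
    """Orthographic projection along the x axis."""
    _, y, z = point
    return (y, z)


# A few arbitrary 3D points standing in for surface samples.
original = [(0, 0, 1), (1, 0, 2), (0, 2, 3), (1, 2, 5)]
sheared = [shear_depth(p, mu=1, nu=2) for p in original]

same_frontal = [frontal_view(p) for p in original] == [frontal_view(p) for p in sheared]
same_side = [side_view(p) for p in original] == [side_view(p) for p in sheared]
```

Because the shear only alters depth, the two frontal projections coincide point for point, while the side projections differ.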
Figure 4.
Impossible triangles created in two different ways. The top row shows a generic and accidental view of a sculpture by Brian McKay and Ahmad Abas (https://www.flickr.com/photos/themachobox/1068978352). The accidental view appears to have a cotermination of two bars that are actually separated in depth. The bottom row shows a generic and accidental view of a similar sculpture by Mathieu Hamaekers (https://im-possible.info/english/art/sculpture/hemaekers_unity.html). In that one, the accidental view appears to have straight edges, but they are actually curved in depth.
Figure 5.
The correspondence matching task developed by Phillips, Todd, Koenderink, and Kappers (1997). Observers adjust the position of a dot on a surface so that it matches the position of a dot shown in an earlier interval.
Figure 6.
Two possible deformations of a square shape labeled A and B. The upper panel of each pair shows the original square, and the lower panel shows the transformed version of it. Note that some parts of the figures change while others do not. The changed regions are colored red.
Figure 7.
Some example stimuli used to study the perceptual salience of different types of shape change. All of the shapes in the periphery involve distortions of the one in the center. Moving clockwise from the upper left, the depicted distortions involve a change in the aspect ratio, making parallel lines converge, adding curvature to straight line segments, or punching a hole in an object. (Reprinted from Todd, Weismantel, & Kallie, 2014.)
Figure 8.
The evolutionary shape change between Diodon on the left and Orthagoriscus on the right. (Reprinted from On Growth and Form by D'Arcy Wentworth Thompson, 1917.)
Figure 9.
Two sequences of human head profiles. The one on the left was generated using a cardioidal strain transformation to simulate normal human growth. The one on the right was created with a modified version of that transformation that simulates human evolution.
Figure 10.
The left panel shows a man's face from the painting American Gothic by Grant Wood (1930). The right panel shows a transformed version in which the distance between the eyebrows and the mouth was reduced. The original image on the left is judged to have a sad expression, whereas the one on the right is judged as angry. (Reprinted from Neth & Martinez, 2009.)
Figure 11.
The medial axes of a dog and a camel computed using a traditional method (left) and a Bayesian estimation procedure (right). (Reprinted from Feldman and Singh, copyright (2006) National Academy of Sciences, U.S.A., with permission. This material is excluded from any Creative Commons license.)
Figure 12.
An example of an adversarial image. The left panel shows the image of a panda. The middle panel shows a pattern of noise that has been optimized so that it is categorized as a gibbon by GoogLeNet with the highest possible confidence. The right panel shows the image on the left with a small amount of noise from the middle panel added to it. This image is categorized as a panda by human observers and as a gibbon by GoogLeNet. (Reprinted from Goodfellow et al., 2017, available for use through open access.)
Figure 13.
Three images from a match-to-sample task. The sample in the left panel contains half the contours from a drawing of a lock. The middle panel shows a spatially scrambled version of the sample, and the right panel shows a complementary version that contains all of its deleted contours. Human observers choose the complementary version as most similar to the sample, whereas the HMAX network chooses the spatially scrambled version. (Re-created from Hayworth, Yue, & Biederman, 2007, with permission from the authors.)
Figure 14.
Two local probe tasks to measure the perception of 3D shape on curved surfaces. For the relative depth probe depicted on the left, observers must choose whether the red or green dot appears closer in depth. For the orientation probe on the right, observers must adjust a circular gauge figure until it appears to rest in the tangent plane within a designated local region. Note that the figure on the upper right appears to satisfy that criterion, but the one on the lower left does not.
Figure 15.
The edge labeling model of Malik (1987). The left panel shows the symbols used for representing different types of edges. The middle panel shows some different types of vertices used by the model and their possible interpretations. The upper right panel shows a complex object with a consistent pattern of labels for all of its edges. The lower right panel shows an impossible object for which some of the edges have no interpretations that are consistent for both of their connected vertices.
Figure 16.
The top panel shows the image of a textured object, with cast shadows and specular reflections. The center panel shows an edge filtered version of that image, and the bottom panel shows a line drawing with only the corners and occlusions.
Figure 17.
Three objects constructed with cylindrical parts. Note that they are easily recognizable as a bull, a swan, and a dog.
Figure 18.
A simple two-part object composed of a brick and a curved cylinder. The brick geon is defined by its parallel straight edges, its straight central axis, its three arrow vertices, and its one Y-vertex. The curved cylinder is defined by its curved central axis and edges and by its two three-tangent vertices.
Figure 19.
The left panel shows a shaded image of a smoothly curved surface without any sharp corners or a closed boundary contour. Observers typically describe this as a circular ridge with many small bumps and saddles along its crest and a larger bump in the center. The right panel shows the same surface with a series of iso-height contours. The red, green, and blue dots mark height maxima, saddle points, and height minima, respectively.
Figure 20.
The shape index proposed by Koenderink (1990) partitions local surface regions into five distinct categories that are intuitively identified as bumps, ridges, saddles, valleys, and dimples.
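The caption for Figure 20 does not give a formula, so the following is only a sketch of how such a partition could be computed. It assumes the commonly cited arctangent form of the shape index, written here with a convex-positive sign convention (published definitions differ in sign), and it uses illustrative bin boundaries at ±0.25 and ±0.75 that are assumptions rather than values from the article. The function names are hypothetical.

```python
import math


def shape_index(k1, k2):
    """Shape index in [-1, 1] from two principal curvatures.

    Assumed convention: convex (bump-like) curvature is positive.
    """
    k_max, k_min = max(k1, k2), min(k1, k2)
    if k_max == 0 and k_min == 0:
        raise ValueError("shape index is undefined for a planar patch")
    return (2.0 / math.pi) * math.atan2(k_max + k_min, k_max - k_min)


def shape_category(s):
    """Map a shape index to one of five labels (bin edges are illustrative)."""
    if s >= 0.75:
        return "bump"
    if s >= 0.25:
        return "ridge"
    if s > -0.25:
        return "saddle"
    if s > -0.75:
        return "valley"
    return "dimple"


# Canonical curvature pairs, one per category.
examples = {
    "bump": (1.0, 1.0),
    "ridge": (1.0, 0.0),
    "saddle": (1.0, -1.0),
    "valley": (0.0, -1.0),
    "dimple": (-1.0, -1.0),
}
labels = {name: shape_category(shape_index(*ks)) for name, ks in examples.items()}
```

One consequence of this form is that the index depends only on the ratio of the two principal curvatures, so scaling a surface patch uniformly leaves its category unchanged.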
Figure 21.
A shaded image of a curved surface and three types of contour drawings. Moving clockwise from the upper left, the panels depict a smoothly shaded image of the object, its silhouette, its rim, and the rim combined with curvature extremal contours, for which Kmax is a local maximum or Kmin is a local minimum. Note that the curvature extremal contours dramatically improve the quality of the drawings.
Figure 22.
Two line drawings of the head of “David” by Michelangelo (1504) created with different types of contours. The one on the left shows only the rim contours, whereas the one on the right also includes suggestive contours. (Reprinted from DeCarlo, Finkelstein, Rusinkiewicz & Santella, 2003, with permission from the authors.)
Figure 23.
An aspect graph that shows the four possible characteristic views of a torus and the possible transitions between them.
Figure 24.
The left panel shows The Librarian by Giuseppe Arcimboldo (1566), and the right panel shows a blurred version of it. Note that the original painting can be perceived as a person or an arrangement of books, but when the image is blurred, it can only be perceived as a person.
Figure 25.
Girl With a Mandolin by Pablo Picasso (1910). Note how the subject is easily recognizable despite the large amounts of distortion.

References

    1. Alcorn, M. A., Li, Q., Gong, Z., Wang, C., Mai, L., Ku, W., & Nguyen, A. (2019). Strike (with) a pose: Neural networks are easily fooled by strange poses of familiar objects. In CVPR, Computer Vision Foundation/IEEE (pp. 4845–4854).
    2. Anderson, M., & Feil, T. (2015). A first course in abstract algebra (3rd ed.). Boca Raton, FL: CRC Press.
    3. Attneave, F. (1954). Some informational aspects of visual perception. Psychological Review, 61, 183–193.
    4. Belhumeur, P., Kriegman, D. J., & Yuille, A. L. (1997). The bas relief ambiguity. In Proceedings of the 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 1060–1066). Washington, DC: IEEE Computer Society Press.
    5. Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2), 115–147. doi:10.1037/0033-295X.94.2.115