Proc Natl Acad Sci U S A. 2013 Jul 9;110(28):11618-23.
doi: 10.1073/pnas.1217479110. Epub 2013 Jun 24.

Trade-off between curvature tuning and position invariance in visual area V4

Tatyana O Sharpee et al. Proc Natl Acad Sci U S A.

Abstract

Humans can rapidly recognize a multitude of objects despite differences in their appearance. The neural mechanisms that endow high-level sensory neurons with both selectivity to complex stimulus features and "tolerance" or invariance to identity-preserving transformations, such as spatial translation, remain poorly understood. Previous studies have demonstrated that both tolerance and selectivity to conjunctions of features are increased at successive stages of the ventral visual stream that mediates visual recognition. Within a given area, such as visual area V4 or the inferotemporal cortex, tolerance has been found to be inversely related to the sparseness of neural responses, which in turn was positively correlated with conjunction selectivity. However, the direct relationship between tolerance and conjunction selectivity has been difficult to establish, with different studies reporting either an inverse or no significant relationship. To resolve this, we measured V4 responses to natural scenes, and using recently developed statistical techniques, we estimated both the relevant stimulus features and the range of translation invariance for each neuron. Focusing the analysis on tuning to curvature, a tractable example of conjunction selectivity, we found that neurons that were tuned to more curved contours had smaller ranges of position invariance and produced sparser responses to natural stimuli. These trade-offs provide empirical support for recent theories of how the visual system estimates 3D shapes from shading and texture flows, as well as the tiling hypothesis of the visual space for different curvature values.

Keywords: Gabor model; feature selectivity; natural stimuli; object recognition; vision.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Estimating feature selectivity of V4 neurons with natural stimuli. (A) Schematic representation of LN models used to characterize feature selectivity and invariance of V4 neurons. Shown here is an LN model with two relevant stimulus features that, in conjunction, can trigger the neural response when positioned at a number of different locations within the visual field. (B and C) Feature selectivity of example V4 neurons with smaller (B, 6%) and larger (C, 12%) ranges of position invariance, respectively. The range of position is measured relative to the spatial field that is shown in B and C encompassing the relevant stimulus features. Color maps indicate the first (Upper) and second (Lower) most relevant stimulus features. The color scale indicates values (either positive or negative) relative to the mean luminance, divided by the average SD across different pixels obtained from different jackknife estimates. The red scale bar is 1°. The overlaid contour plots were obtained by fitting curved 2D Gabor functions to the relevant stimulus features. Neurons m17b_1 (B) and j46a_1 (C). (D) Trade-off between curvature and invariance in V4. The curvature index 1 here is the curvature parameter from the fitted curved Gabor models normalized such that the stimulus frame size has a unit length of 1. We find that this curvature index decreases with the range of position invariance for both the first MIID (gray triangles) and the second MIID (open circles). The solid line shows the least-squares fit for all points (P = 0.005, linear correlation); the dashed line is the fit just through points for the first MIID (P = 0.042, linear correlation).
Fig. 2.
The effect of threshold for spiking on position invariance and sparseness. Two competing models are schematically represented. (A) The range of position invariance is determined by threshold. (B) Position invariance is determined by the range of inputs in space; threshold is scaled proportionately to the strongest input. (C) Negative correlation between threshold and invariance range (P = 0.004, linear correlation) is consistent with both models. Each point is a V4 neuron. (D) Threshold and sparseness are uncorrelated (P = 0.37), which is consistent with model B and not A. (E) Sparseness is larger for neurons that were best described by models with zero position invariance than for neurons best described by models with limited (from 5% to 15%) position invariance (P = 0.0027, Mann–Whitney test). The decrease of sparseness with invariance range x was better described by an inverse quadratic function (solid line shows the best fit) rather than a linear function; P = 0.03, correlation between invariance range and [formula].
Fig. 3.
The two relevant features often form “quadrature” pairs. Across the population, the two relevant stimulus features have on average the same preferred orientation (A) and spatial frequency values (B). The corresponding P values are 0.44 and 0.59 for linear correlation. Points represent different neurons and are colored by the difference between MIID1 and MIID2 of the same neuron in terms of spatial phase (A) and preferred orientation (B). (Inset) Histogram of phase differences between MIID1 and MIID2 shows a peak at ∼50°, reminiscent of selectivity to a quadrature pair of relevant stimulus features typical of V1 complex cells.
Fig. 4.
Predictive power of LN models with position invariance. (A) The distribution of the percent of mutual information explained by LN models with a variable range of position invariance. (B) A comparison of information explained by the best model to the overall (model-free) information contained in the firing rate. The solid line at 45° corresponds to the case in which all of the information is explained. (C) The distribution of optimal translation ranges across the population of V4 neurons suggests that fewer neurons are needed to represent the visual space when they have a wider range of translation invariance.
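
The figure captions describe an LN model in which curved Gabor features can trigger a response at several positions, followed by a spiking threshold. The Python sketch below is one way such a model could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names `curved_gabor` and `invariant_ln_response`, the quadratic bending term used to curve the Gabor carrier, the sum-of-squares ("energy") combination of the two features, the max over positions, and the rectifying threshold.

```python
import numpy as np


def curved_gabor(size, sigma, freq, theta, phase, curvature):
    """Gabor patch whose carrier is bent by a quadratic term.

    Hypothetical parameterization: coordinates are scaled so the frame
    has unit length, and `curvature` bends the carrier along the axis
    orthogonal to the preferred orientation.
    """
    half = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    x = (xs - half) / size
    y = (ys - half) / size
    # Rotate into the feature's own coordinate frame.
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    u_bent = u + curvature * v ** 2  # curvature = 0 gives a standard Gabor
    envelope = np.exp(-(u ** 2 + v ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * freq * u_bent + phase)


def invariant_ln_response(image, features, max_shift, threshold):
    """Thresholded response of a position-invariant LN unit (assumed form).

    For every shift within +/- max_shift pixels of the image center, the
    patch is projected onto each feature and the squared projections are
    summed. The largest value over shifts is then passed through a
    rectifying threshold.
    """
    size = features[0].shape[0]
    r0 = (image.shape[0] - size) // 2
    c0 = (image.shape[1] - size) // 2
    if min(r0, c0) < max_shift:
        raise ValueError("image too small for the requested shift range")
    best = -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            patch = image[r0 + dy:r0 + dy + size, c0 + dx:c0 + dx + size]
            energy = sum(float(np.sum(patch * f)) ** 2 for f in features)
            best = max(best, energy)
    return max(best - threshold, 0.0)


# Usage: two features differing only in phase, probed with a noise image.
# Shifts of 0, 2, and 5 pixels on a 32-pixel frame correspond to roughly
# 0%, 6%, and 15% of the frame size.
rng = np.random.default_rng(0)
scene = rng.standard_normal((64, 64))
f1 = curved_gabor(32, 0.2, 3.0, 0.0, 0.0, 2.0)
f2 = curved_gabor(32, 0.2, 3.0, 0.0, np.pi / 2, 2.0)
for shift in (0, 2, 5):
    print(shift, invariant_ln_response(scene, [f1, f2], shift, 0.0))
```

Because the response is a maximum over a growing set of positions, widening `max_shift` can only keep or raise the pre-threshold drive for a given image. In this toy model, a unit with a wider invariance range therefore needs a higher threshold to stay equally selective.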
