J Vis. 2021 Aug 2;21(8):15. doi: 10.1167/jov.21.8.15.

Closing the gap between single-unit and neural population codes: Insights from deep learning in face recognition


Connor J Parde et al. J Vis.

Abstract

Single-unit responses and population codes differ in the "read-out" information they provide about high-level visual representations. Diverging local and global read-outs can be difficult to reconcile with in vivo methods. To bridge this gap, we studied the relationship between single-unit and ensemble codes for identity, gender, and viewpoint, using a deep convolutional neural network (DCNN) trained for face recognition. Analogous to the primate visual system, DCNNs develop representations that generalize over image variation, while retaining subject (e.g., gender) and image (e.g., viewpoint) information. At the unit level, we measured the number of single units needed to predict attributes (identity, gender, viewpoint) and the predictive value of individual units for each attribute. Identification was remarkably accurate using random samples of only 3% of the network's output units, and all units had substantial identity-predicting power. Cross-unit responses were minimally correlated, indicating that single units code non-redundant identity cues. Gender and viewpoint classification required large-scale pooling of units; individual units had weak predictive power. At the ensemble level, principal component analysis of face representations showed that identity, gender, and viewpoint separated into high-dimensional subspaces, ordered by explained variance. Unit-based directions in the representational space were compared with the directions associated with the attributes. Identity, gender, and viewpoint contributed to all individual unit responses, undercutting a neural tuning analogy. Instead, single-unit responses carry superimposed, distributed codes for face identity, gender, and viewpoint. This undermines confidence in the interpretation of neural representations from unit response profiles for both DCNNs and, by analogy, high-level vision.
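The unit-subsampling analysis described in the abstract can be sketched as follows. Everything here is a hypothetical stand-in: the descriptors are simulated Gaussian vectors rather than the paper's 512-dimensional DCNN outputs, and rank-1 identification replaces the paper's AUC measure.

```python
# Minimal sketch of identification from random subsets of network units.
# All data are simulated; only the dimensionality (512) follows the paper.
import numpy as np

rng = np.random.default_rng(0)
n_ids, n_units = 20, 512

# Hypothetical: one "gallery" and one "probe" descriptor per identity,
# modeled as an identity mean plus small image-level noise.
identity_means = rng.normal(size=(n_ids, n_units))
probe = identity_means + 0.1 * rng.normal(size=(n_ids, n_units))
gallery = identity_means + 0.1 * rng.normal(size=(n_ids, n_units))

def identification_accuracy(sample_size):
    """Rank-1 identification accuracy using a random sample of units."""
    units = rng.choice(n_units, size=sample_size, replace=False)
    p = probe[:, units]
    g = gallery[:, units]
    # Cosine similarity between every probe and every gallery descriptor.
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    sims = p @ g.T
    return float(np.mean(np.argmax(sims, axis=1) == np.arange(n_ids)))

for k in (512, 16, 2):
    print(k, identification_accuracy(k))
```

With low noise, accuracy stays high even for small unit samples, mirroring the qualitative pattern the paper reports for its descriptors.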


Figures

Figure 1.
(A) Identification accuracy, measured as area under the ROC curve (AUC), is plotted as a function of subspace dimensionality. Performance is nearly perfect (AUC ≈ 1.0) with the full 512-dimensional descriptor and declines negligibly until subspace dimensionality reaches 16 units. Performance with as few as two units remains above chance. (B) Correlation histogram for unit responses across images indicates that units capture non-redundant information for identification.
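The cross-unit correlation analysis summarized in panel B can be sketched as a pairwise correlation of unit responses across images. The response matrix below is simulated, not the paper's actual DCNN activations.

```python
# Sketch of the Figure 1B analysis: correlate every pair of units'
# responses across a set of images (hypothetical data).
import numpy as np

rng = np.random.default_rng(2)
n_images, n_units = 1000, 64
responses = rng.normal(size=(n_images, n_units))  # images x units

# Columns are variables (units), so rowvar=False.
corr = np.corrcoef(responses, rowvar=False)

# Off-diagonal entries are the pairwise unit-response correlations;
# near-zero values indicate non-redundant unit codes.
off_diag = corr[~np.eye(n_units, dtype=bool)]
print(off_diag.mean())
```

A histogram of `off_diag` concentrated near zero corresponds to the pattern the caption describes.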
Figure 2.
Effect sizes for units (A) and principal components (B) for identity, gender, and viewpoint. For both units and principal components, top panels illustrate the dominance of identity over gender and viewpoint. Lower panels show an approximately uniform distribution of effect sizes for units (A) and differentiated effect sizes for principal components (B) in all three attributes.
Figure 3.
Gender and viewpoint prediction with variable numbers of randomly sampled units. Gender classification declines gradually (A) and viewpoint prediction declines rapidly (B) as sample size decreases. Mean performance across samples (n=50) is shown with a diamond, colored by sample size. Because these performance measures are qualitatively different, they should not be compared in absolute terms (for comparison between gender, viewpoint, and identity, see effect sizes; Figure 2).
Figure 4.
(A) Sliding windows of PCs used to predict identity (purple), gender (teal), and yaw (yellow) across the PC subspaces. Identification accuracy is highest when using early PCs. Gender and viewpoint classification are best when using subspaces with the highest effect sizes for gender and viewpoint separation, respectively. (B) Similarity between PCs and directions diagnostic for identity (purple), gender (teal), and yaw (yellow). Identity direction is the average similarity between identity templates and PCs. Gender direction is the linear discriminant line from the LDA for gender classification. Viewpoint direction is the weight vector from the linear regression for viewpoint prediction.
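The PC-versus-direction comparison in panel B can be sketched as the absolute cosine similarity between each principal component and a candidate attribute direction (e.g., an LDA discriminant for gender). The data, dimensionality, and direction below are hypothetical stand-ins.

```python
# Sketch of the Figure 4B similarity analysis on simulated descriptors.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 32))   # hypothetical 32-D face descriptors

# PCA via SVD of the centered data; rows of Vt are unit-norm PCs.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

# Hypothetical attribute direction (in the paper: identity template
# average, LDA discriminant for gender, or regression weights for yaw).
direction = rng.normal(size=32)
direction /= np.linalg.norm(direction)

# Absolute cosine similarity between the direction and each PC.
sims = np.abs(Vt @ direction)
print(sims.shape)
```

Because the PCs form an orthonormal basis, the squared similarities sum to one, so `sims` shows how a given attribute direction distributes across the PC subspaces.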
Figure 5.
(Top) For a single example unit, absolute value of similarities between unit direction and each PC shows confounding of unit response with identity, gender, and viewpoint. (Bottom) Density plot of similarities between the example unit and PCs associated with identity (purple), gender (blue), and viewpoint (yellow). The distributions overlap almost completely, indicating that each type of information contributes to the unit's activation. This finding was consistent across all unit basis vectors.
