[Preprint]. 2023 May 20:2023.05.18.541176.
doi: 10.1101/2023.05.18.541176.

Energy Guided Diffusion for Generating Neurally Exciting Images


Paweł A Pierzchlewicz et al. bioRxiv.

Abstract

In recent years, most exciting inputs (MEIs) synthesized from encoding models of neuronal activity have become an established method to study tuning properties of biological and artificial visual systems. However, as we move up the visual hierarchy, the complexity of neuronal computations increases. Consequently, it becomes more challenging to model neuronal activity, requiring more complex models. In this study, we introduce a new attention readout for a convolutional data-driven core for neurons in macaque V4 that outperforms the state-of-the-art task-driven ResNet model in predicting neuronal responses. However, as the predictive network becomes deeper and more complex, synthesizing MEIs via straightforward gradient ascent (GA) can struggle to produce qualitatively good results and overfit to idiosyncrasies of a more complex model, potentially decreasing the MEI's model-to-brain transferability. To solve this problem, we propose a diffusion-based method for generating MEIs via Energy Guidance (EGG). We show that for models of macaque V4, EGG generates single-neuron MEIs that generalize better across architectures than the state-of-the-art GA while preserving the within-architecture activation and requiring 4.7x less compute time. Furthermore, EGG diffusion can be used to generate other neurally exciting images, like most exciting natural images that are on par with a selection of highly activating natural images, or image reconstructions that generalize better across architectures. Finally, EGG is simple to implement, requires no retraining of the diffusion model, and can easily be generalized to provide other characterizations of the visual system, such as invariances. Thus, EGG provides a general and flexible framework to study coding properties of the visual system in the context of natural images.
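The core idea of EGG (guiding a pre-trained diffusion sampler with the gradient of an energy defined by the encoding model, with no retraining) can be sketched with a toy stand-in. Everything here is an illustrative assumption, not the paper's actual implementation: the "encoder" is a linear readout, the "denoising step" merely shrinks the sample toward zero, and the names `response`, `energy_grad`, and `egg_sample` are hypothetical.

```python
import numpy as np

# Toy sketch of Energy Guidance (EGG): steer a diffusion-style sampler
# toward images that excite a model neuron, without any retraining.
rng = np.random.default_rng(0)
w = rng.normal(size=64)  # hypothetical readout weights of a trained encoder

def response(x):
    """Predicted neuron response for image x (here: a linear readout)."""
    return float(w @ x)

def energy_grad(x):
    """Gradient of the energy E(x) = -response(x); for a linear readout, -w."""
    return -w

def egg_sample(steps=50, lam=0.5):
    """Run a toy reverse-diffusion loop with energy guidance at each step."""
    x = rng.normal(size=64)  # start from pure noise
    for _ in range(steps):
        # Stand-in for one denoising step of a pre-trained diffusion model.
        x = 0.9 * x + 0.1 * rng.normal(size=64)
        # Energy guidance: move downhill in energy, i.e. uphill in the
        # predicted response, scaled by the energy scale lam (λ).
        x = x - lam * energy_grad(x)
    return x

x0 = rng.normal(size=64)   # an unguided noise image, for comparison
mei = egg_sample()         # guided sample: a toy "most exciting input"
```

In the paper, the response would come from the trained V4 encoding model and the denoising step from a pre-trained diffusion model; λ corresponds to the energy scale varied in Figure 5.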


Figures

Figure 1:
Schematic of the EGG diffusion method with a pre-trained diffusion model. Examples of applications: Left: Most Exciting Inputs for different neurons, Middle: Most Exciting Natural Inputs matched unit-wise to the MEIs. Right: Reconstructions in comparison to the ground truth (top) and gradient descent optimized (bottom).
Figure 2:
a) Schematic of the attention readout. b) Correlation to average scores for 1244 neurons. The data-driven model with attention readout (pink) shows a significant increase in mean correlation to average compared to the task-driven ResNet (blue) model (Wilcoxon signed-rank test, p = 6.79 × 10⁻⁸²). c) Predictive performance comparison of the two models in a closed-loop MEI evaluation setting, showing that the data-driven model with attention readout better predicts the in vivo responses to the MEIs.
Figure 3:
a) Examples of MEIs optimized using EGG diffusion and GA for macaque V4 ResNet and ACNN models. b) Comparison of activations for different neurons between EGG diffusion and GA on the Within and Cross Architecture validation paradigms. Line fits obtained via Huber regression with ε=1.1.
Figure 4:
Mean generation times for EGG and GA (error bars denote standard error).
Figure 5:
a) Examples of MENIs optimized using EGG diffusion in macaque V4 for different neurons and different energy scales λ ∈ {1, 2, 5, 10}. b) Mean and standard error of the normalized activations of neurons across different energy scales. c) Comparison of the MENI activations to the top-1 most activating ImageNet images per neuron in the cross-architecture domain. Line fit obtained via Huber regression with ε = 1.1. Three points at (11, 65), (9, 70), and (16, 120) are not shown for visualization purposes.
Figure 6:
a) Schematic of the reconstruction paradigm. The generated image is compared to the ground truth image via L2 distance in the unit activation space. b) L2 distances in the unit activation space for the within- and cross-architecture domains, comparing the EGG and GA generation methods; EGG generalizes better across architectures than GA. c) Examples of reconstructions generated by EGG and GA in comparison to the ground truth (GT).
Figure 7:
Examples of failure cases in comparison to the gradient ascent method. Text shows the response rate predicted by the within-architecture validator.
