Sci Rep. 2022 Jan 7;12(1):141.
doi: 10.1038/s41598-021-03938-w.

Hyperrealistic neural decoding for reconstructing faces from fMRI activations via the GAN latent space

Thirza Dado et al. Sci Rep. 2022.

Abstract

Neural decoding can be conceptualized as the problem of mapping brain responses back to sensory stimuli via a feature space. We introduce (i) a novel experimental paradigm that uses well-controlled yet highly naturalistic stimuli with a priori known feature representations and (ii) an implementation thereof for HYPerrealistic reconstruction of PERception (HYPER) of faces from brain recordings. To this end, we embrace the use of generative adversarial networks (GANs) at the earliest step of our neural decoding pipeline by acquiring fMRI data as participants perceive face images synthesized by the generator network of a GAN. We show that the latent vectors used for generation effectively capture the same defining stimulus properties as the fMRI measurements. As such, these latents (conditioned on the GAN) are used as the in-between feature representations underlying the perceived images that can be predicted in neural decoding for (re-)generation of the originally perceived stimuli, leading to the most accurate reconstructions of perception to date.
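The decoding step described in the abstract, predicting GAN latent vectors from brain responses, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the data are synthetic, the array sizes are toy values rather than the study's, and closed-form ridge regression stands in for whatever linear model the authors actually fitted. The names `fit_ridge`, `X_train` and `Z_train` are hypothetical.

```python
import numpy as np


def fit_ridge(X, Z, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Z."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Z)


rng = np.random.default_rng(0)
# Toy sizes chosen for speed (assumption; not the dimensions used in the study).
n_train, n_test, n_voxels, n_latent = 200, 20, 64, 8

# Latent vectors that would have generated the face stimuli.
Z_train = rng.standard_normal((n_train, n_latent))
Z_test = rng.standard_normal((n_test, n_latent))

# Synthetic "brain responses": a noisy linear function of the latents (assumption).
mixing = rng.standard_normal((n_latent, n_voxels))
X_train = Z_train @ mixing + 0.1 * rng.standard_normal((n_train, n_voxels))
X_test = Z_test @ mixing + 0.1 * rng.standard_normal((n_test, n_voxels))

# Fit the linear decoder on training data, then predict latents for unseen responses.
W = fit_ridge(X_train, Z_train, alpha=1.0)
Z_pred = X_test @ W

# In a full pipeline, Z_pred would be passed to the GAN generator to
# reconstruct the perceived faces; the generator is not part of this sketch.
```

Because the stimuli are generated from known latents, the regression targets are available exactly, which is what makes a simple linear map from voxels to latents a workable decoder in this setup.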

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Neural coding. The mapping between sensory stimuli (left) and brain measurements (right) via a feature space (middle). Neural encoding seeks to find a transformation from stimulus to the observed brain response. Conversely, neural decoding seeks to find the information present in the observed brain responses by a mapping from brain activity back to the originally perceived stimulus.
Figure 2
Illustration of the HYPER pipeline. Face images are generated from randomly sampled latent vectors z by a GAN and presented as stimuli during brain scanning. A linear model predicts latent vectors z^ for unseen brain responses to feed back to the GAN for reconstruction.
Figure 3
PGGAN generator network. The architecture consists of nine blocks with a total of 23.1 M trainable parameters. It transforms 512-dimensional Gaussian latent vectors into high-resolution RGB face images (1024×1024 pixels).
Figure 4
(A) Experimental paradigm. Visual stimuli were flashed with a frequency of 3.33 Hz for 1.5 s followed by an interstimulus interval of 3 s. (B) Voxel masks. The 4096 most active voxels were selected based on the highest z-statistics within the averaged z-map from the training set responses.
Figure 5
Stimulus-reconstructions. The three blocks show twelve arbitrarily chosen but representative test set examples. The first column displays the face stimuli, whereas the second and third columns display the corresponding reconstructions from brain activations of subjects 1 and 2, respectively.
Figure 6
Latent similarity maps. The diagonal displays the similarity between target and predicted latent vectors whereas off-diagonal entries display similarity between targets and randomly sampled latents from the same standard Gaussian distribution. The dark blue diagonal denotes that predictions always outperform random latents in terms of latent similarity.
Figure 7
Qualitative results. Model performance of the HYPER model compared to the VAE-GAN approach and the eigenface approach. The model columns display the best possible results by direct encoding and decoding of the stimuli (i.e., noise ceiling; no brain data is used for these reconstructions). For HYPER, the stimuli themselves are the best possible results.
Figure 8
Attribute scores. Stimulus-reconstruction examples (subject 1) with rotated bar graphs denoting the attribute scores for gender, age, eyeglasses, pose and smile to visually demonstrate how this metric can be used to evaluate model performance with respect to semantic face attributes.
Figure 9
Attribute reconstruction performance. The correlation coefficients between observed and predicted target scores are found to be highly significant for gender, age and pose (p<<0.05; Student’s t-test), significant for eyeglasses (p<0.05; Student’s t-test) and not significant for smile (p>>0.05; Student’s t-test).
Figure 10
Reliability of brain recordings. The bar graphs show the mean classification accuracy with standard deviation (Y axis) for nine classifiers (X axis) that are trained on an increasing number of brain volume repetitions. The dotted line denotes chance level.
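Two of the steps named in the captions above, selecting the most active voxels by z-statistic (Figure 4B) and comparing latent similarity for predictions against random latents (Figure 6), can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the captions do not state which similarity measure was used, so cosine similarity is an assumption, and the function names `top_k_voxel_mask` and `cosine_similarity`, the volume shape, and the noise level are all hypothetical.

```python
import numpy as np


def top_k_voxel_mask(z_map, k):
    """Boolean mask marking the k voxels with the highest z-statistics."""
    flat = z_map.ravel()
    top_idx = np.argpartition(flat, -k)[-k:]  # indices of the k largest values
    mask = np.zeros(flat.shape, dtype=bool)
    mask[top_idx] = True
    return mask.reshape(z_map.shape)


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (n, d) arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / den


rng = np.random.default_rng(0)

# Toy averaged z-map; the study selected 4096 voxels, here k is small (assumption).
z_map = rng.standard_normal((10, 10, 10))
mask = top_k_voxel_mask(z_map, k=50)

# Target latents, noisy stand-ins for decoded latents, and random baseline latents
# drawn from the same standard Gaussian distribution.
targets = rng.standard_normal((30, 512))
predictions = targets + rng.standard_normal((30, 512))
random_latents = rng.standard_normal((30, 512))

sim_pred = cosine_similarity(targets, predictions)
sim_rand = cosine_similarity(targets, random_latents)
```

In this toy setting `sim_pred` exceeds `sim_rand` for every example, which mirrors the kind of comparison the latent similarity maps display; the margin here depends entirely on the synthetic noise level chosen above.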
