Sci Rep. 2022 Jan 7;12(1):141.

doi: 10.1038/s41598-021-03938-w.

Hyperrealistic neural decoding for reconstructing faces from fMRI activations via the GAN latent space

Thirza Dado¹, Yağmur Güçlütürk², Luca Ambrogioni², Gabriëlle Ras², Sander Bosch², Marcel van Gerven², Umut Güçlü²

Affiliations

¹ Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands. thirza.dado@donders.ru.nl.
² Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands.

PMID: 34997012
PMCID: PMC8741893
DOI: 10.1038/s41598-021-03938-w

Thirza Dado et al. Sci Rep. 2022.

Abstract

Neural decoding can be conceptualized as the problem of mapping brain responses back to sensory stimuli via a feature space. We introduce (i) a novel experimental paradigm that uses well-controlled yet highly naturalistic stimuli with a priori known feature representations and (ii) an implementation thereof for HYPerrealistic reconstruction of PERception (HYPER) of faces from brain recordings. To this end, we embrace the use of generative adversarial networks (GANs) at the earliest step of our neural decoding pipeline by acquiring fMRI data as participants perceive face images synthesized by the generator network of a GAN. We show that the latent vectors used for generation effectively capture the same defining stimulus properties as the fMRI measurements. As such, these latents (conditioned on the GAN) are used as the in-between feature representations underlying the perceived images that can be predicted in neural decoding for (re-)generation of the originally perceived stimuli, leading to the most accurate reconstructions of perception to date.
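
The decoding step described above (predict a latent vector from a brain response, then hand it to the generator) can be sketched on synthetic data. This is a minimal illustration, not the authors' implementation: the ridge penalty `alpha`, the simulated responses, the reduced dimensions, and every function and variable name are assumptions made for the example. The generator itself is left out, so the sketch stops at the predicted latents.

```python
import numpy as np

def fit_linear_decoder(responses, latents, alpha=1.0):
    """Ridge-regularized least squares: weights B such that responses @ B ~ latents.

    Solves (X^T X + alpha * I) B = X^T Z. The ridge penalty is an assumption of
    this sketch; the source only says that a linear model is used.
    """
    n_features = responses.shape[1]
    gram = responses.T @ responses + alpha * np.eye(n_features)
    return np.linalg.solve(gram, responses.T @ latents)

rng = np.random.default_rng(0)
latent_dim, n_voxels = 16, 64  # reduced for speed; the captions mention 512 and 4096

# Hypothetical linear "brain": each voxel responds to a random mix of latent dimensions.
mixing = rng.standard_normal((latent_dim, n_voxels))

def simulate_responses(z, noise_sd=0.5):
    """Simulated voxel responses to the stimuli generated from latents z."""
    return z @ mixing + noise_sd * rng.standard_normal((z.shape[0], n_voxels))

# Latents are sampled from a standard Gaussian, as for the GAN-generated stimuli.
z_train = rng.standard_normal((400, latent_dim))
z_test = rng.standard_normal((50, latent_dim))

B = fit_linear_decoder(simulate_responses(z_train), z_train)
z_hat = simulate_responses(z_test) @ B  # predicted latents for unseen responses

# Per-trial correlation between target and predicted latent vectors.
mean_corr = float(np.mean([np.corrcoef(t, p)[0, 1] for t, p in zip(z_test, z_hat)]))
print(f"mean latent correlation on held-out trials: {mean_corr:.3f}")
```

In a full pipeline, each row of `z_hat` would be passed to the pretrained generator to render the reconstructed face.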

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
Neural coding. The mapping between sensory stimuli (left) and brain measurements (right) via a feature space (middle). Neural encoding seeks to find a transformation from stimulus to the observed brain response. Conversely, neural decoding seeks to find the information present in the observed brain responses by a mapping from brain activity back to the originally perceived stimulus.

**Figure 2**
Illustration of the HYPER pipeline. Face images are generated from randomly sampled latent vectors z by a GAN and presented as stimuli during brain scanning. A linear model predicts latent vectors $\hat{z}$ for unseen brain responses to feed back to the GAN for reconstruction.

**Figure 3**
PGGAN generator network. The architecture consists of nine blocks with a total of 23.1 M trainable parameters. It transforms 512-dimensional Gaussian latent vectors into high-resolution RGB face images ($1024 \times 1024$ pixels).

**Figure 4**
(A) Experimental paradigm. Visual stimuli were flashed with a frequency of 3.33 Hz for 1.5 s followed by an interstimulus interval of 3 s. (B) Voxel masks. The 4096 most active voxels were selected based on the highest z-statistics within the averaged z-map from the training set responses.

**Figure 5**
Stimulus-reconstructions. The three blocks show twelve arbitrarily chosen but representative test set examples. The first column displays the face stimuli, whereas the second and third columns display the corresponding reconstructions from brain activations of subjects 1 and 2, respectively.

**Figure 6**
Latent similarity maps. The diagonal displays the similarity between target and predicted latent vectors whereas off-diagonal entries display similarity between targets and randomly sampled latents from the same standard Gaussian distribution. The dark blue diagonal denotes that predictions always outperform random latents in terms of latent similarity.

**Figure 7**
Qualitative results. Model performance of the HYPER model compared to the VAE-GAN approach and the eigenface approach. The *model* columns display the best possible results by direct encoding and decoding of the stimuli (i.e., noise ceiling; no brain data is used for these reconstructions). For HYPER, the stimuli themselves are the best possible results.

**Figure 8**
Attribute scores. Stimulus-reconstruction examples (subject 1) with rotated bar graphs denoting the attribute scores for gender, age, eyeglasses, pose and smile to visually demonstrate how this metric can be used to evaluate model performance with respect to semantic face attributes.

**Figure 9**
Attribute reconstruction performance. The correlation coefficients between observed and predicted target scores are found to be highly significant for gender, age and pose ($p \ll 0.05$; Student’s t-test), significant for eyeglasses ($p < 0.05$; Student’s t-test) and not significant for smile ($p \gg 0.05$; Student’s t-test).

**Figure 10**
Reliability of brain recordings. The bar graphs show the mean classification accuracy with standard deviation (Y axis) for nine classifiers (X axis) that are trained on an increasing number of brain volume repetitions. The dotted line denotes chance level.
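
Two analysis steps named in the captions above, the voxel mask of Figure 4B and the latent similarity comparison of Figure 6, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the z-map and latents are random stand-ins, cosine similarity is an assumed choice (the captions do not name the similarity measure), and all function names are hypothetical.

```python
import numpy as np

def select_top_voxels(z_map, k):
    """Indices of the k voxels with the highest z-statistics, highest first."""
    flat = np.asarray(z_map).ravel()
    top = np.argpartition(flat, -k)[-k:]          # unordered top-k indices
    return top[np.argsort(flat[top])[::-1]]       # sort them by descending z-statistic

def cosine_similarity(a, b):
    """Cosine similarity between two vectors (an assumed similarity measure)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)

# Hypothetical averaged z-map over 10,000 voxels; keep the 4096 most active ones.
z_map = rng.standard_normal(10_000)
mask = select_top_voxels(z_map, k=4096)

# Hypothetical target latent, a noisy "prediction" of it, and an unrelated random latent.
target = rng.standard_normal(512)
predicted = target + rng.standard_normal(512)
random_latent = rng.standard_normal(512)

sim_pred = cosine_similarity(target, predicted)
sim_rand = cosine_similarity(target, random_latent)
print(f"target vs. predicted: {sim_pred:.3f}, target vs. random: {sim_rand:.3f}")
```

A prediction that carries information about the target should score above the random baseline, which is the comparison the diagonal and off-diagonal entries of the similarity maps express.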
