iScience. 2025 May 13;28(6):112663. doi: 10.1016/j.isci.2025.112663. eCollection 2025 Jun 20.

Neural mechanisms in resolving prior and likelihood uncertainty in scene recognition

Kojiro Hayashi et al. iScience. 2025.

Abstract

Recognizing real-world scenes requires integrating sensory (likelihood) and prior information, yet how the brain represents these components remains unclear. To investigate this, we employed deep image transformation to generate images with parametrically controlled naturalness, enabling precise manipulation of likelihood uncertainty. Concurrently, we designed a sequential image-scene recognition task that quantitatively modulates prior information. By combining these AI-generated images with the task, we conducted a functional magnetic resonance imaging (fMRI) experiment enabling systematic control of both likelihood and prior information. The results revealed that higher visual areas were activated when viewing images with low likelihood uncertainty. In contrast, the default mode network, which includes the medial prefrontal gyrus, inferior parietal lobule, and middle temporal gyrus, exhibited higher activation when more prior information was available. This approach highlights how applying AI technology to neuroscience questions can enhance our understanding of neural mechanisms underlying scene recognition.

Keywords: Cognitive neuroscience; Neuroscience; Sensory neuroscience.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Experimental procedure
(A) Architecture of image generation. One set of images covered “natural,” “mixed,” and “unnatural” images with different levels of naturalness (expressed as a scalar α), artificially generated with a deep learning method (GANSID). Our image-generation method includes an encoder, a generator, and a map generator. The generator, trained with the latent variable Zϕ from the input image and the map latent variable Zθ, recreates the original image. The map generator was pre-trained using Itti’s saliency map. The images were produced using Zθ and a mix of Zϕ and a random normal variable ε at ratio α. Varying α from 0 to 1 generated images with different levels of naturalness: the natural and unnatural images were characterized by α = 1.0 and α = 0.0, respectively, and the mixed images by in-between values, α = 0.2, 0.4, 0.6, and 0.8.
(B) Procedure for a single block consisting of six trials. The participants were first instructed on the block condition (Binary or Gradual). In a Binary block, six images including natural (α = 1.0) and unnatural (α = 0.0) images, generated from different original images, were presented in random order (not shown in the figure). In a Gradual block, a set of six images generated from one original image was presented in one of two orders, depending on the condition: in the N-U condition, the images were displayed in order of decreasing naturalness, i.e., from natural (α = 1.0, N) to unnatural (α = 0.0, U), whereas in the U-N condition, they were displayed in order of increasing naturalness, i.e., from unnatural to natural. In each trial, participants were presented with one image (for 4,000 ms) and were required to report whether or not they could recognize the scene by selecting a green square (corresponding to “Yes,” recognizable) or a red square (corresponding to “No,” unrecognizable) using an MRI-compatible response box (within 1,500 ms).
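The latent-mixing step in (A) can be summarized compactly. The sketch below is a minimal illustration, assuming the mix of Zϕ and ε at ratio α is a linear interpolation; the exact mixing operation, the function name, and the latent dimensionality are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def mix_latents(z_phi, alpha, rng):
    """Blend the image latent z_phi with standard-normal noise at ratio alpha.

    alpha = 1.0 keeps z_phi intact ("natural"); alpha = 0.0 replaces it
    entirely with noise ("unnatural"); in-between values give "mixed" images.
    """
    eps = rng.standard_normal(z_phi.shape)        # the random normal variable ε
    return alpha * z_phi + (1.0 - alpha) * eps    # assumed linear interpolation

rng = np.random.default_rng(0)
z_phi = rng.standard_normal(512)  # stand-in for the encoder's output for one image
# The six naturalness levels used in the experiment:
latents = {a: mix_latents(z_phi, a, rng) for a in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)}
```

Per the caption, each resulting latent would then be passed to the generator together with the map latent Zθ to produce the image at that naturalness level.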
Figure 2
Behavioral results and model of participants’ scene recognition
(A) Participant-wise scene recognition rate as a function of the naturalness level of the image under the N-U (blue) and U-N (orange) conditions (N = 30). The circles and solid lines represent the averages across participants of the actual data and the model predictions (see (C)), respectively. The error bars and shaded areas represent the standard error. For each naturalness level, the recognition rate was statistically compared between the N-U and U-N conditions (Wilcoxon signed-rank test; ∗∗p < 0.01, ∗∗∗p < 0.001, Bonferroni-corrected).
(B) The proportion of trials categorized by the four response patterns across the previous and current trials (No→No, No→Yes, Yes→No, and Yes→Yes). For example, No→Yes indicates trials in which participants evaluated the image in the current trial as recognizable and that in the previous trial as unrecognizable; in other words, they switched their response. The circles and solid lines represent the participant means and the model predictions (see (C)), respectively. The error bars and shaded areas represent the standard error.
(C) A Bayesian model of participant scene recognition. The model assumes that when a participant observed an image with naturalness level α, their internal evidence for recognizing the scene followed a normal distribution N(α, σ). For each participant, a single threshold C, common to all α values, was estimated, with the area to the right of the threshold defined as P(α|Yes). P(α|Yes) denotes the probability, or likelihood, of recognizing the scene given the current image alone. Finally, the posterior probability P(Yes|α) that the participant responded “Yes” to the current image was updated according to Bayes’ rule.
(D) The mean squared error (MSE) between the fixation density maps (FDMs) for the natural images (α = 1.0) and those for images at the other naturalness levels (α) in the N-U (blue) and U-N (orange) conditions (N = 20). The circles and solid lines represent the averages across all participants for the GANSID images, while the diamonds and dashed lines represent the means for the scrambled images. The error bars indicate the standard error (Wilcoxon signed-rank test; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, Bonferroni-corrected).
(E) The MSE of the FDMs for each naturalness-level image between the N-U and U-N conditions. The format is the same as in (D).
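In equations, the model in (C) can be read as follows. The likelihood term is the tail area of N(α, σ) above the participant-specific threshold C; the trial-by-trial update written below is one plausible reading of “updated according to Bayes’ rule,” not the paper’s stated formula.

```latex
% Likelihood of a "Yes" from the current image alone:
% the area of N(\alpha, \sigma) to the right of the threshold C
P(\alpha \mid \mathrm{Yes}) = 1 - \Phi\!\left(\frac{C - \alpha}{\sigma}\right)

% Assumed trial-by-trial update: the posterior from trial t-1
% serves as the prior at trial t
P_t(\mathrm{Yes} \mid \alpha_t)
  = \frac{P(\alpha_t \mid \mathrm{Yes})\, P_{t-1}(\mathrm{Yes})}
         {P(\alpha_t \mid \mathrm{Yes})\, P_{t-1}(\mathrm{Yes})
          + P(\alpha_t \mid \mathrm{No})\, P_{t-1}(\mathrm{No})}
```

Here Φ is the standard normal cumulative distribution function, and P(α|No) = 1 − P(α|Yes).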
Figure 3
Brain activity induced by image naturalness and image presentation timing
Brain activation areas reflecting the main effect of naturalness (A, red) and unnaturalness (A, yellow), and of whether the image was presented in the first (B, red) or last (B, yellow) trial of the block, were examined using a two-way ANOVA (voxel level: p < 0.0001, uncorrected; cluster level: p < 0.05, FWE-corrected). No statistically significant interaction effects were observed. The line plots show the percent signal change (PSC) in each anatomical ROI (see STAR Methods) as a function of the naturalness level α of the presented images in the N-U (blue) and U-N (orange) conditions. Solid lines indicate the average values across participants. The shaded areas represent the standard error.
Figure 4
Model-based analysis of fMRI
(A) ROI-wise correlation coefficients between the percent signal change at each anatomical ROI (see STAR Methods) and the joint probability (predicted by the Bayesian model, Figure 2C) of two types of behaviors, “Yes” (recognizable) and “No” (unrecognizable), for pairs of previously and currently presented images. Blue and orange represent the N-U and U-N conditions, respectively. Circles indicate the average values across participants. Error bars indicate the standard error.
(B) ROI-wise correlation coefficients between the percent signal change at each ROI and each of the three model-based indicators: “likelihood” (red), “prior” (yellow), and “surprise” (green). Error bars indicate the standard error of the mean.
(C) Results of the parametric modulation analysis (voxel level: p < 0.0001, uncorrected; cluster level: p < 0.05, FWE-corrected). Brain regions correlated with each of the three model-based indicators: “likelihood” (red), “prior” (yellow; voxel level: p < 0.001, uncorrected, minimum 100 voxels), and “surprise” (green).
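For concreteness, the ROI-wise analysis in (B) amounts to a Pearson correlation between each ROI’s percent-signal-change values and a model-derived regressor. The sketch below is an assumed illustration only; the array shapes, the function name, and how trials are aggregated are not taken from the paper.

```python
import numpy as np

def roi_model_correlation(psc, regressor):
    """Pearson r between each ROI's percent signal change and a model indicator.

    psc:       (n_rois, n_trials) percent signal change per ROI and trial
    regressor: (n_trials,) model-based values, e.g., trial-by-trial "prior"
    """
    psc_c = psc - psc.mean(axis=1, keepdims=True)   # center each ROI's series
    reg_c = regressor - regressor.mean()            # center the regressor
    num = psc_c @ reg_c
    den = np.sqrt((psc_c ** 2).sum(axis=1) * (reg_c ** 2).sum())
    return num / den                                # one r per ROI
```

Per-participant coefficients computed this way would then be averaged across participants, matching the circles and error bars shown in the panel.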
