Review

Probabilistic brains: knowns and unknowns

Alexandre Pouget et al. Nat Neurosci. 2013 Sep;16(9):1170-8. doi: 10.1038/nn.3495. Epub 2013 Aug 18.

Abstract

There is strong behavioral and physiological evidence that the brain both represents probability distributions and performs probabilistic inference. Computational neuroscientists have started to shed light on how these probabilistic representations and computations might be implemented in neural circuits. One particularly appealing aspect of these theories is their generality: they can be used to model a wide range of tasks, from sensory processing to high-level cognition. To date, however, these theories have only been applied to very simple tasks. Here we discuss the challenges that will emerge as researchers start focusing their efforts on real-life computations, with a focus on probabilistic learning, structural learning and approximate inference.


Figures

Figure 1
The visuo-haptic multisensory experiment of Ernst and Banks. (a) Subjects were asked to estimate the width of a bar that they could see and touch. Subjects did not see an actual bar, but saw a set of dots floating above the background, as if glued to an otherwise invisible bar. In addition, the background dots did not all appear at the same depth, but followed a Gaussian distribution with a mean equal to the mean depth of the background. The same applied to the dots corresponding to the bar. The reliability of the visual input was controlled by the variance of the Gaussian distributions in depth. This variance varied from trial to trial and acted as a nuisance parameter. Adapted from ref. . (b) The posterior distribution over the width (p(w|w_v, w_t), green curve) is proportional to the product of the visual (p(w_v|w), blue curve) and haptic (p(w_t|w), red curve) likelihood functions. Note that the posterior distribution is shifted toward the more reliable cue (the one with the smaller variance; in this case, vision).
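
For Gaussian likelihoods, the product in (b) has a closed form: the posterior is Gaussian, its mean is the inverse-variance-weighted average of the two cue estimates, and its variance is smaller than either cue's. Below is a minimal sketch of that computation; the numerical widths and variances are illustrative, not values from the experiment.

```python
def combine_gaussian_cues(mu_v, var_v, mu_t, var_t):
    """Posterior over bar width w given Gaussian visual and haptic likelihoods.

    p(w | w_v, w_t) is proportional to p(w_v | w) * p(w_t | w); for Gaussians the
    posterior mean is the inverse-variance-weighted average of the cue estimates.
    """
    weight_v = (1.0 / var_v) / (1.0 / var_v + 1.0 / var_t)   # weight on vision
    weight_t = 1.0 - weight_v                                # weight on touch
    mu_post = weight_v * mu_v + weight_t * mu_t
    var_post = 1.0 / (1.0 / var_v + 1.0 / var_t)             # narrower than either cue
    return mu_post, var_post

# Illustrative numbers: vision is the more reliable cue, so the posterior mean
# lands closer to the visual estimate (cf. the green curve in b).
print(combine_gaussian_cues(mu_v=55.0, var_v=1.0, mu_t=60.0, var_t=4.0))
```
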
Figure 2
Probabilistic population code using a basis function decomposition of the log probability. Top left, the basis functions (in this example, the log of each neuron's tuning curve) of 15 neurons responding to a periodic stimulus whose value varies from −180 to 180. Top right, pattern of spike counts, calculated over a 200-ms interval, across the same neuronal population in response to a stimulus whose value is 0. The spike counts were drawn from a Poisson distribution with means specified by the tuning curves. To turn spike counts into a log probability, we first multiply each basis function by its corresponding spike count. Given that only three neurons are active on this trial, only three basis functions remain (center left, scaled by spike counts). The scaled basis functions are then summed to yield the log probability (up to a constant). Bottom left, the un-normalized log probability. Bottom right, the probability (properly normalized). Note that the two plots on the right (spike count versus stimulus and probability versus stimulus) represent the same probability distribution, just in different formats, much as a function can be represented directly or by its Fourier transform.
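
Concretely, with independent Poisson spike counts r_i and tuning curves f_i(s), the log posterior is, up to a constant, the spike-count-weighted sum of the log tuning curves, sum_i r_i log f_i(s). The sketch below walks through this decoding step; the circular Gaussian tuning curves, their gains, and their widths are hypothetical choices, not values taken from the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
s_grid = np.linspace(-180, 180, 361)                  # candidate stimulus values
prefs = np.linspace(-180, 180, 15, endpoint=False)    # preferred stimuli of 15 neurons

def tuning(s, pref, gain=10.0, width=40.0):
    """Mean spike count in a 200-ms window (hypothetical circular Gaussian tuning)."""
    d = (s - pref + 180.0) % 360.0 - 180.0            # wrapped stimulus difference
    return gain * np.exp(-0.5 * (d / width) ** 2) + 0.1  # small baseline keeps logs finite

# Basis functions: the log tuning curves evaluated on the stimulus grid.
log_f = np.log(np.array([tuning(s_grid, p) for p in prefs]))   # shape (15, 361)

# One trial: Poisson spike counts in response to a stimulus whose value is 0.
counts = rng.poisson([tuning(0.0, p) for p in prefs])

# Scale each basis function by its spike count and sum them: log posterior + const.
# (Assumes sum_i f_i(s) is roughly constant in s, as for dense, evenly spaced
# tuning curves, so that term can be absorbed into the normalization.)
log_post = counts @ log_f
post = np.exp(log_post - log_post.max())
post /= post.sum()                                    # the properly normalized probability
print("posterior peaks near", s_grid[np.argmax(post)])
```
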
Figure 3
Taking a product of likelihood functions with probabilistic population codes. Bottom panels, probabilistic population codes for the two likelihoods shown in Figure 1b (the blue and red curves). Summing the two population codes (neuron by neuron) yields a population code (top) for the product of the two likelihoods (the green curve in Fig. 1b), as required for optimal multisensory integration (equation (1)).
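
Because the log likelihood in this scheme is linear in the spike counts, adding the two population responses neuron by neuron adds their log likelihoods, which is the same as multiplying the likelihoods. A minimal, self-contained sketch of that property follows; the tuning curves and the stimulus values fed to each modality are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
s_grid = np.linspace(0, 100, 501)                 # candidate bar widths (arbitrary units)
prefs = np.linspace(0, 100, 15)                   # preferred widths of 15 neurons

def tuning(s, pref, gain=10.0, width=8.0):
    """Hypothetical Gaussian tuning curve over bar width (mean Poisson count)."""
    return gain * np.exp(-0.5 * ((np.asarray(s) - pref) / width) ** 2) + 0.1

log_f = np.log(np.array([tuning(s_grid, p) for p in prefs]))   # log tuning curves

def decode(counts):
    """Likelihood function over width implied by a vector of Poisson spike counts."""
    lp = counts @ log_f
    p = np.exp(lp - lp.max())
    return p / p.sum()

counts_vis = rng.poisson([tuning(55.0, p) for p in prefs])     # visual population response
counts_hap = rng.poisson([tuning(60.0, p) for p in prefs])     # haptic population response

# Summing the two population codes neuron by neuron ...
combined = decode(counts_vis + counts_hap)
# ... decodes to the (normalized) product of the two individual likelihoods.
product = decode(counts_vis) * decode(counts_hap)
print(np.allclose(combined, product / product.sum()))          # True
```
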
Figure 4
Neural network for Chinese character identification. The input layer (bottom) corresponds to the image of a particular character. The output layer (top) represents the probability distribution over all possible Chinese characters (only four are shown for clarity). The matrices W_1 and W_2 specify the values of all the weights in the network; these are adjusted to optimize performance. In the probabilistic approach, these weights would be replaced by a probability distribution over weights.
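
A minimal sketch of this architecture (input layer, one hidden layer, and a softmax output giving a probability distribution over characters), together with a crude illustration of replacing the point-estimate weights by a distribution over weights. The layer sizes, the tanh nonlinearity, and the isotropic Gaussian spread over the weights are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_hidden, n_chars = 64, 32, 4          # illustrative layer sizes

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x, W1, W2):
    """Two-layer network: hidden nonlinearity, then softmax over characters."""
    h = np.tanh(W1 @ x)
    return softmax(W2 @ h)                       # p(character | image, W1, W2)

x = rng.random(n_pixels)                         # stand-in for a character image
W1_mean = 0.1 * rng.standard_normal((n_hidden, n_pixels))
W2_mean = 0.1 * rng.standard_normal((n_chars, n_hidden))

# Standard approach: a single point estimate of the weights.
print(forward(x, W1_mean, W2_mean))

# Probabilistic approach (sketch): average the network's output over a
# distribution on the weights, here crudely taken to be an isotropic Gaussian
# centered on the point estimates.
samples = [forward(x,
                   W1_mean + 0.05 * rng.standard_normal(W1_mean.shape),
                   W2_mean + 0.05 * rng.standard_normal(W2_mean.shape))
           for _ in range(100)]
print(np.mean(samples, axis=0))                  # averaged distribution over characters
```
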
Figure 5
Incremental structural learning. As data are observed, new units and new links are added to capture the structure of the model that best accounts for the data. Shown is the most likely graph at each point during training, not a distribution over graphs. Computing the full posterior over graphs is often intractable, in which case one settles for a set of highly probable graphs (of which only the most likely is shown here). (a) Dominance relations in a monkey colony. Each link represents a pair of monkeys in which one actively dominates the other. (b) Animal taxonomy, in which case the graph is a tree.
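
One common fallback when the posterior over graphs is intractable is a greedy search that keeps only the best-scoring graph found so far. The sketch below illustrates this for dominance data as in (a): as pairwise observations arrive, candidate graphs differing by one edge are scored and the highest-scoring graph is kept. The likelihood model (an observed pair is probable if the matching edge exists) and the per-edge penalty standing in for a prior over graphs are assumptions chosen only to make the example runnable.

```python
import math
from itertools import permutations

def log_score(edges, observations, p_hit=0.9, p_miss=0.1, edge_penalty=1.5):
    """Log posterior (up to a constant) of a dominance graph given observed pairs.

    An observation (a, b) means a was seen dominating b; it is likely under the
    graph if the edge a -> b is present. The penalty acts as a prior favoring
    sparse graphs.
    """
    ll = sum(math.log(p_hit if (a, b) in edges else p_miss)
             for a, b in observations)
    return ll - edge_penalty * len(edges)

def incremental_learn(monkeys, stream):
    """Greedily add one edge at a time whenever it improves the score."""
    edges, seen = set(), []
    for obs in stream:                          # data arrive one pair at a time
        seen.append(obs)
        improved = True
        while improved:
            improved = False
            best = log_score(edges, seen)
            for cand in permutations(monkeys, 2):
                if cand not in edges:
                    s = log_score(edges | {cand}, seen)
                    if s > best:
                        best, best_edge, improved = s, cand, True
            if improved:
                edges.add(best_edge)            # keep only the most likely graph
    return edges

stream = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "C"), ("B", "C")]
print(incremental_learn(["A", "B", "C"], stream))
```
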

References

    1. Van Horn KS. Constructing a logic of plausible inference: a guide to Cox’s theorem. Int. J. Approx. Reason. 2003;34:3–24.
    2. De Finetti B, Machi A, Smith A. Theory of Probability: a Critical Introductory Treatment. New York: Wiley; 1993.
    3. Bayes T. An essay towards solving a problem in the doctrine of chances. Philos. Trans. R. Soc. Lond. 1763;53:370–418.
    4. Laplace PS. Theorie Analytique des Probabilites. Paris: Ve Courcier; 1812.
    5. Stigler SM. Stigler’s law of eponymy. Trans. N. Y. Acad. Sci. 1980;39:147–158.
