Review
Annu Rev Neurosci. 2014:37:289-306.
doi: 10.1146/annurev-neuro-071013-013924.

Basal ganglia circuits for reward value-guided behavior


Okihide Hikosaka et al. Annu Rev Neurosci. 2014.

Abstract

The basal ganglia are equipped with inhibitory and disinhibitory mechanisms that enable a subject to choose valuable objects and actions. Notably, a value can be determined flexibly by recent experience or stably by prolonged experience. Recent studies have revealed that the head and tail of the caudate nucleus selectively and differentially process flexible and stable values of visual objects. These signals are sent to the superior colliculus through different parts of the substantia nigra so that the animal looks preferentially at high-valued objects, but in different manners. Thus, relying on short-term value memories, the caudate head circuit allows the subject's gaze to move expectantly to recently valued objects. Relying on long-term value memories, the caudate tail circuit allows the subject's gaze to move automatically to previously valued objects. The basal ganglia also contain an equivalent parallel mechanism for action values. Such flexible-stable parallel mechanisms for object and action values create a highly adaptable system for decision making.

Keywords: caudate nucleus; flexible value; stable value; substantia nigra; superior colliculus; visual object.


Figures

Figure 1. Basal ganglia circuit controlling the initiation of saccadic eye movements
Some neurons in the monkey caudate nucleus (CD) are excited by visual inputs which originate from the cerebral cortices and other areas. The CD neurons inhibit the tonic activity of substantia nigra pars reticulata (SNr) neurons through direct connections or enhance the tonic activity of SNr neurons through indirect connections. Since the SNr connection to the superior colliculus (SC) is inhibitory, the direct signal from the CD leads to a disinhibition of SC neurons (as illustrated in the figure on the right). On the other hand, the indirect signal from the CD leads to an enhanced inhibition of SC neurons. Arrows indicate excitatory connections (or effects). Lines with rectangles indicate inhibitory connections. Solid and hatched lines indicate direct and indirect connections, respectively.
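The sign logic of this caption can be checked with a minimal sketch. It is an illustration only, not a model from the review: the function names are hypothetical, each connection is reduced to a sign (+1 enhancing, -1 inhibitory), and the indirect pathway is collapsed into a single net enhancing step because the caption does not list its intermediate stages.

```python
# Hypothetical sign-propagation sketch of the circuit described in Figure 1.
# +1 = excitatory (or net enhancing) effect, -1 = inhibitory effect.

def net_effect(signs):
    """Multiply the signs along a pathway to get the net effect on the target."""
    effect = 1
    for s in signs:
        effect *= s
    return effect

def describe(effect):
    """Translate a net sign on the SC into the wording used in the caption."""
    return "disinhibition of SC" if effect > 0 else "enhanced inhibition of SC"

# Direct pathway: CD inhibits SNr, and SNr inhibits SC.
direct = net_effect([-1, -1])

# Indirect pathway: the caption states only that the net CD effect on SNr is
# enhancing, so it is modeled as one +1 step (an assumption), then SNr -| SC.
indirect = net_effect([+1, -1])

print("direct:  ", describe(direct))
print("indirect:", describe(indirect))
```

Two inhibitory steps in series give a positive net effect, which is why the direct signal releases the SC while the indirect signal suppresses it further.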
Figure 2. General scheme of decision making
A proper decision should be based on slowly accumulated and to-be-retained data (long-term data) as well as quickly acquired and to-be-erased data (short-term data). One interpretation, following Bayesian theory, may be that long-term data, short-term data, and decision correspond respectively to prior, likelihood, and posterior.
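The Bayesian reading suggested in this caption can be written as a single relation. The symbols are assumptions introduced here for illustration: $v$ is the value of an object or action, $d_{\mathrm{short}}$ is the short-term data, and the prior $P(v)$ is taken to have been shaped by the long-term data.

```latex
\underbrace{P(v \mid d_{\mathrm{short}})}_{\text{posterior (decision)}}
\;\propto\;
\underbrace{P(d_{\mathrm{short}} \mid v)}_{\text{likelihood (short-term data)}}
\;\times\;
\underbrace{P(v)}_{\text{prior (long-term data)}}
```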
Figure 3. Experimental procedures to selectively activate the flexible and stable value mechanisms
(a) Flexible value procedure. Fractal objects change their values across blocks of trials. The values of objects are learned and tested in the same block. (b) Stable value procedure. Fractal objects stably retain their values during learning across days. The values of objects are then tested on separate days. During the test session, the objects are not associated with their stable values.
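One way to picture why the two procedures tap different memories is a sketch with two value estimates updated at different speeds. This is an illustrative assumption, not the authors' model: the delta-rule update, the learning rates, and the trial counts are all hypothetical.

```python
def update(value, reward, rate):
    """Delta-rule update: move the estimate toward the reward by `rate`."""
    return value + rate * (reward - value)

FAST_RATE = 0.5   # hypothetical rate for a flexible (short-term) memory
SLOW_RATE = 0.02  # hypothetical rate for a stable (long-term) memory

flexible = 0.0
stable = 0.0

# Prolonged experience: the object is rewarded on 200 consecutive trials.
for _ in range(200):
    flexible = update(flexible, 1.0, FAST_RATE)
    stable = update(stable, 1.0, SLOW_RATE)

# Recent change: the same object goes unrewarded for 5 trials.
for _ in range(5):
    flexible = update(flexible, 0.0, FAST_RATE)
    stable = update(stable, 0.0, SLOW_RATE)

print(f"flexible estimate: {flexible:.3f}")
print(f"stable estimate:   {stable:.3f}")
```

Under these assumed rates, the fast estimate follows the recent unrewarded trials while the slow estimate still reflects the long reward history, which mirrors the distinction between values learned and tested within a block and values learned across days.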
Figure 4. Differential encoding of flexible and stable values in subregions of the caudate nucleus
(a) Anatomy of the caudate nucleus and its subregions in the macaque monkey. (b) Average responses to the flexibly valued objects of neurons in the CDh, CDb, and CDt. Neuronal responses were averaged for the neurons' preferred values (magenta) and non-preferred values (black) using a cross-validation method. The yellow line indicates the difference between the preferred and non-preferred responses (mean ± SE). (c) Average responses to the stably valued objects in the three caudate subregions. (d) Proportions of flexible value coding neurons in the three caudate subregions. (e) Proportions of stable value coding neurons.
Figure 5. High capacity memory of stable values encoded by SNr neurons
(a) The locations of the CDt and SNr(p) shown on a coronal Nissl-stained section. The CDt (red) has a direct inhibitory connection to the dorsolateral SNr(p) (yellow) which then inhibits presaccadic neurons in the SC. (b-top) The responses of an SC-projecting SNr(p) neuron to 120 well-learned objects (c). It was inhibited by most high-valued objects (red) and excited by most low-valued objects (blue). (b-bottom) The average responses of 151 SNr(p) neurons to high-valued objects (red) and low-valued objects (blue) which were chosen randomly from about 300 well-learned objects.
Figure 6. Parallel CD-SNr-SC circuits underlying value-based decision making
(a) Anatomical scheme. The CDh receives inputs mainly from the frontal cortical areas (Yeterian and Pandya, 1991), while the CDt receives inputs mainly from the temporal cortical areas (Saint-Cyr et al., 1990). The CDh and CDt have equivalent downstream mechanisms (as shown in Figure 1), but use different neural circuits before reaching the SC. (b) Information processing along the parallel CD-SNr-SC circuits, which emulates the general scheme of decision making in Figure 2. These circuits have contrasting characteristics in terms of memory and motor output mechanisms.
Figure 7. Differential impairments of controlled and automatic saccades by CDh and CDt inactivations
(a) Injection sites of muscimol in the caudate nucleus: CDh (top) and CDt (bottom). (b) Effects on the controlled saccades that were predictively influenced by reward feedback in the flexible value procedure (Figure 3a). The differences in the target acquisition time between high- and low-valued objects are plotted before and during inactivation (mean ± SE). Data are shown for CDh inactivation (top) and CDt inactivation (bottom). (c) Effects on the automatic saccades that occurred without reward feedback but were influenced by the stable value of the presented object (Figure 3b). The differences in the probability of automatic looking between high- and low-valued objects are plotted before and during inactivation (mean ± SE). Same format as in (b). The effects are shown only for contralateral saccades.
Figure 8. Hypothetical parallel mechanisms controlling behavior based on object values and action values
The first mechanism aims at finding good objects, while the second mechanism aims at manipulating the objects. These mechanisms together enable animals and humans to gain access to rewards efficiently. For both mechanisms, the anterior part of the basal ganglia processes flexible values and guides controlled behavior, while the posterior part of the basal ganglia processes stable values and guides automatic behavior. However, they may not share the same neural circuits within the basal ganglia. Other brain structures may also constitute the mechanisms, especially those connected with the basal ganglia including the cerebral cortex (not shown).

