Review
Curr Opin Neurobiol. 2013 Apr;23(2):229-38.
doi: 10.1016/j.conb.2012.11.012. Epub 2012 Dec 22.

Updating dopamine reward signals

Wolfram Schultz. Curr Opin Neurobiol. 2013 Apr.

Abstract

Recent work has advanced our knowledge of phasic dopamine reward prediction error signals. The error signal is bidirectional, reflects well the higher order prediction error described by temporal difference learning models, is compatible with model-free and model-based reinforcement learning, reports the subjective rather than physical reward value during temporal discounting and reflects subjective stimulus perception rather than physical stimulus aspects. Dopamine activations are primarily driven by reward, and to some extent risk, whereas punishment and salience have only limited activating effects when appropriate controls are respected. The signal is homogeneous in terms of time course but heterogeneous in many other aspects. It is essential for synaptic plasticity and a range of behavioural learning situations.
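The bidirectional prediction error described by temporal difference (TD) models can be illustrated with a minimal sketch. It assumes the standard textbook TD(0) error, delta_t = r_t + gamma * V(s_{t+1}) - V(s_t). The function name `td_errors`, the three-step trial layout (cue, delay, reward time) and the value estimates are hypothetical choices made for illustration and are not taken from the review.

```python
def td_errors(rewards, values, gamma=1.0):
    """Return the TD(0) prediction error at each time step of one trial.

    rewards[t] is the reward received on leaving step t.
    values[t] is the current value estimate V(s_t); the value after
    the final step is assumed to be 0 (end of trial).
    """
    errors = []
    for t, r in enumerate(rewards):
        v_next = values[t + 1] if t + 1 < len(values) else 0.0
        errors.append(r + gamma * v_next - values[t])
    return errors


# Hypothetical trial with three steps: cue, delay, reward time.
learned_values = [1.0, 1.0, 1.0]   # reward is fully predicted
naive_values = [0.0, 0.0, 0.0]     # reward is not predicted at all

predicted = td_errors([0.0, 0.0, 1.0], learned_values)   # no error anywhere
omitted = td_errors([0.0, 0.0, 0.0], learned_values)     # negative error at reward time
unexpected = td_errors([0.0, 0.0, 1.0], naive_values)    # positive error at reward time
```

Under these assumptions a fully predicted reward yields no error, an omitted reward yields a negative error at the expected reward time, and an unpredicted reward yields a positive error, which is the sense in which the signal is bidirectional.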

Figures

Figure 1
Characteristics of phasic dopamine reward prediction error responses. (a) Neuronal coding of reward prediction error closely parallels theoretical prediction error of temporal difference (TD) model ([4••], © National Academy of Sciences USA). (b) Temporal discounting of neuronal response to stimulus predicting differently delayed rewards closely parallels behavioural discounting ([15], © Society for Neuroscience). (c) Neuronal response depends on subjective stimulus perception ([24••], © National Academy of Sciences USA). (d) Stimulus generalisation explains majority of responses to conditioned aversive stimuli. Change in sensory modality of reward predicting stimulus reduces response to unchanged aversive stimulus ([34], © Nature). (e) Percentages of dopamine neurons activated by reward (blue, left), motivational salience uncontrolled for stimulus or context generalisation (green) and true motivational salience (red, right). Data from [34]. (f) Graded coding of value prediction after initial generalisation coincides with stimulus identification by animal in dot motion task. Percentage of coherently moving dots results in graded percentage of correct performance and reward delivery ([3], © Society for Neuroscience).
Figure 2
Dopamine dependency of neuronal plasticity and behavioural learning. (a) Positive timing in spike time dependent plasticity protocol (STDP) results in long term potentiation (LTP) at synapses from cortical inputs to striato-nigral neurons (direct pathway) (black) and is blocked by dopamine D1 receptor antagonist SCH23390 (red) ([77••], © Science). (b) Negative timing in STDP protocol results in long term depression (LTD) at cortical synapses onto striato-pallidal neurons (indirect pathway) (black) and is blocked by dopamine D2 receptor antagonist sulpiride (red) ([77••], © Science). (c) T-maze learning deficit in mice with NMDA receptor knock-out in midbrain dopamine neurons impairing dopamine burst firing ([84], © National Academy of Sciences USA). (d) Separate performance deficit in mice tested in (c).

References

    1. Schultz W., Apicella P., Ljungberg T. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci. 1993;13:900–913.
    2. Schultz W. Predictive reward signal of dopamine neurons. J Neurophysiol. 1998;80:1–27.
    3. Nomoto K., Schultz W., Watanabe T., Sakagami M. Temporally extended dopamine responses to perceptually demanding reward-predictive stimuli. J Neurosci. 2010;30:10692–10702.
       Dopamine neurons show two well separated error response components to reward predicting stimuli in a random dot motion task.
    4. Enomoto K., Matsumoto N., Nakai S., Satoh T., Sato T.K., Ueda Y., Inokawa H., Haruno M., Kimura M. Dopamine neurons learn to encode the long-term value of multiple future rewards. Proc Natl Acad Sci USA. 2011;108:15462–15467.
       The most advanced neurophysiological study to date relating dopamine responses to TD learning.
    5. Bayer H.M., Glimcher P.W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 2005;47:129–141.
