Updating dopamine reward signals

Wolfram Schultz¹

PMID: 23267662
PMCID: PMC3866681
DOI: 10.1016/j.conb.2012.11.012

Review

Wolfram Schultz. Curr Opin Neurobiol. 2013 Apr;23(2):229-38.

doi: 10.1016/j.conb.2012.11.012. Epub 2012 Dec 22.

Affiliation

¹ Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK. ws234@cam.ac.uk

Abstract

Recent work has advanced our knowledge of phasic dopamine reward prediction error signals. The error signal is bidirectional, reflects well the higher order prediction error described by temporal difference learning models, is compatible with model-free and model-based reinforcement learning, reports the subjective rather than physical reward value during temporal discounting and reflects subjective stimulus perception rather than physical stimulus aspects. Dopamine activations are primarily driven by reward, and to some extent risk, whereas punishment and salience have only limited activating effects when appropriate controls are respected. The signal is homogeneous in terms of time course but heterogeneous in many other aspects. It is essential for synaptic plasticity and a range of behavioural learning situations.
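The bidirectional error described above can be illustrated with a minimal sketch of the standard temporal difference (TD) prediction error, delta = r + gamma * V(s') - V(s). This is an illustration only, not the analysis used in the review: the trial layout (a stimulus followed by a reward a few steps later), the function names `td_error` and `learn_values`, and the values of `gamma` and `alpha` are all assumptions chosen for readability. The exponential discounting built into this simple model is likewise a modelling assumption, not a claim about the shape of neuronal or behavioural discounting.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """Temporal difference prediction error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


def learn_values(n_steps=5, reward=1.0, gamma=0.9, alpha=0.1, n_trials=2000):
    """Learn state values with TD(0) for a hypothetical trial in which a
    stimulus at step 0 is followed by a reward on leaving the last step."""
    values = [0.0] * (n_steps + 1)  # last entry is the terminal state (value 0)
    for _ in range(n_trials):
        for t in range(n_steps):
            r = reward if t == n_steps - 1 else 0.0
            delta = td_error(r, values[t + 1], values[t], gamma)
            values[t] += alpha * delta  # move the prediction toward the outcome
    return values


values = learn_values()

# Reward with no prior prediction: positive error.
unexpected = td_error(1.0, 0.0, 0.0)
# Reward that is fully predicted after learning: error close to zero.
predicted = td_error(1.0, values[5], values[4])
# Predicted reward that is omitted: negative error.
omitted = td_error(0.0, values[5], values[4])
# Onset of the reward-predicting stimulus from an unpredictive baseline
# (value 0): positive error, smaller when the reward is further away.
stimulus_onset = td_error(0.0, values[0], 0.0)
```

In this sketch the error is positive for an unexpected reward, near zero for a fully predicted one, and negative when a predicted reward is omitted. The response at stimulus onset shrinks as the delay to reward grows because the learned stimulus value is discounted by `gamma` at each step.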

Figures

**Figure 1**
Characteristics of phasic dopamine reward prediction error responses. **(a)** Neuronal coding of reward prediction error closely parallels theoretical prediction error of temporal difference (TD) model ([4••], © National Academy of Sciences USA). **(b)** Temporal discounting of neuronal response to stimulus predicting differently delayed rewards closely parallels behavioural discounting ([15], © Society for Neuroscience). **(c)** Neuronal response depends on subjective stimulus perception ([24••], © National Academy of Sciences USA). **(d)** Stimulus generalisation explains majority of responses to conditioned aversive stimuli. Change in sensory modality of reward-predicting stimulus reduces response to unchanged aversive stimulus ([34], © Nature). **(e)** Percentages of dopamine neurons activated by reward (blue, left), motivational salience uncontrolled for stimulus or context generalisation (green) and true motivational salience (red, right). Data from [34]. **(f)** Graded coding of value prediction after initial generalisation coincides with stimulus identification by animal in dot motion task. Percentage of coherently moving dots results in graded percentage of correct performance and reward delivery ([3•], © Society for Neuroscience).

**Figure 2**
Dopamine dependency of neuronal plasticity and behavioural learning. **(a)** Positive timing in spike time dependent plasticity protocol (STDP) results in long term potentiation (LTP) at synapses from cortical inputs to striato-nigral neurons (direct pathway) (black) and is blocked by dopamine D1 receptor antagonist SCH23390 (red) ([77••], © Science). **(b)** Negative timing in STDP protocol results in long term depression (LTD) at cortical synapses onto striato-pallidal neurons (indirect pathway) (black) and is blocked by dopamine D2 receptor antagonist sulpiride (red) ([77••], © Science). **(c)** T-maze learning deficit in mice with NMDA receptor knock-out in midbrain dopamine neurons impairing dopamine burst firing ([84•], © National Academy of Sciences USA). **(d)** Separate performance deficit in mice tested in (c).

References

1. Schultz W., Apicella P., Ljungberg T. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci. 1993;13:900–913.
2. Schultz W. Predictive reward signal of dopamine neurons. J Neurophysiol. 1998;80:1–27.
3. Nomoto K., Schultz W., Watanabe T., Sakagami M. Temporally extended dopamine responses to perceptually demanding reward-predictive stimuli. J Neurosci. 2010;30:10692–10702.
   Dopamine neurons show two well separated error response components to reward predicting stimuli in a random dot motion task.
4. Enomoto K., Matsumoto N., Nakai S., Satoh T., Sato T.K., Ueda Y., Inokawa H., Haruno M., Kimura M. Dopamine neurons learn to encode the long-term value of multiple future rewards. Proc Natl Acad Sci USA. 2011;108:15462–15467.
   The most advanced neurophysiological study to date relating dopamine responses to TD learning.
5. Bayer H.M., Glimcher P.W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 2005;47:129–141.

