J Neurosci. 2014 Jan 15;34(3):698-704.
doi: 10.1523/JNEUROSCI.2489-13.2014.

Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term

Andrew S Hart et al. J Neurosci.

Abstract

Making predictions about the rewards associated with environmental stimuli and updating those predictions through feedback is an essential aspect of adaptive behavior. Theorists have argued that dopamine encodes a reward prediction error (RPE) signal that is used in such a reinforcement learning process. Recent work with fMRI has demonstrated that the BOLD signal in dopaminergic target areas meets both necessary and sufficient conditions of an axiomatic model of the RPE hypothesis. However, there has been no direct evidence that dopamine release itself also meets necessary and sufficient criteria for encoding an RPE signal. Further, the fact that dopamine neurons have low tonic firing rates that yield a limited dynamic range for encoding negative RPEs has led to significant debate about whether positive and negative prediction errors are encoded on a similar scale. To address both of these issues, we used fast-scan cyclic voltammetry to measure reward-evoked dopamine release at carbon fiber electrodes chronically implanted in the nucleus accumbens core of rats trained on a probabilistic decision-making task. We demonstrate that dopamine concentrations transmit a bidirectional RPE signal with symmetrical encoding of positive and negative RPEs. Our findings strengthen the case that changes in dopamine concentration alone are sufficient to encode the full range of RPEs necessary for reinforcement learning.

Figures

Figure 1.
a, Coronal sections showing the locations of chronically implanted electrodes. Brain atlas sections are from Paxinos and Watson (2005). b, The trial structure was the same in both deterministic and probabilistic tasks. A session contained 160 trials in 20 blocks, each block consisting of four forced-choice trials (two on each lever) followed by four free-choice trials. In the probabilistic task, lever presses on the 75% lever resulted in four 20 mg food pellets on 75% of trials and one pellet on 25% of trials. Probabilities were reversed for the 25% lever. In the deterministic task, the two levers guaranteed four pellets or one pellet, respectively.
Figure 2.
a, Top, Mean ± SD dopamine response to an unsignaled food pellet reward delivered before the beginning of a behavioral session (n = 58, 6 electrodes). Bottom, Latency to maximum dopamine signal for each unsignaled reward presentation. b, Top, Reward-evoked changes in dopamine concentration recorded at one electrode within a single behavioral session for the four lottery-outcome combinations in the probabilistic task. For the purpose of illustration, traces were smoothed with a 3-point running average. Middle and bottom, Average reward-evoked dopamine concentration for n = 6 electrodes for each of the lottery-outcome combinations on forced trials during the probabilistic (middle) and deterministic (bottom) tasks. Shading indicates the epoch used in subsequent analyses.
Figure 3.
a, Mean dopamine release to four pellets is plotted against mean dopamine release to one pellet after both 75% and 25% lottery forced-choice trials for each electrode. Points above the line indicate greater dopamine release to four pellets than to one pellet. b, Mean dopamine release on 25% lottery trials is plotted against mean dopamine release on 75% lottery trials for both prizes for each electrode. Points above the line indicate greater dopamine concentrations when pellets are received from the 25% lottery than from the 75% lottery. c, Mean dopamine release to four pellets is plotted against mean dopamine release to one pellet on the deterministic task for each electrode. Responses are heterogeneously distributed around the equivalence line. d, The mean dopamine release from a–c for n = 6 electrodes is shown for both the probabilistic and deterministic tasks. The ordering of signals satisfies the axiomatic RPE model. e, Mean ± SEM dopamine release for each possible lottery-prize combination across the two tasks is plotted against the predicted RPE, calculated as the difference between reward magnitude and average lottery outcome. The line indicates a significant linear relationship. *p < 0.05 for comparison between lotteries; **p < 0.01 for comparison between prizes, paired t test.
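The predicted RPE described for panel e can be worked through with the lottery definitions given in the Figure 1 caption. The following is a minimal sketch, not the authors' analysis code: it assumes that reward magnitude is measured in pellets and that "average lottery outcome" means the probability-weighted mean number of pellets. The names LOTTERIES, expected_pellets, and predicted_rpes are hypothetical.

```python
# Hypothetical sketch: lottery definitions are taken from the Figure 1 caption.
# Assumption: reward magnitude is counted in pellets, and the "average lottery
# outcome" is the probability-weighted mean number of pellets.
LOTTERIES = {
    "probabilistic 75%": {4: 0.75, 1: 0.25},  # prize (pellets) -> probability
    "probabilistic 25%": {4: 0.25, 1: 0.75},
    "deterministic 4": {4: 1.0},
    "deterministic 1": {1: 1.0},
}


def expected_pellets(lottery):
    """Probability-weighted mean number of pellets for one lottery."""
    return sum(prize * prob for prize, prob in lottery.items())


def predicted_rpes(lotteries):
    """Predicted RPE = received prize minus the lottery's expected outcome."""
    return {
        (name, prize): prize - expected_pellets(lottery)
        for name, lottery in lotteries.items()
        for prize in lottery
    }


rpes = predicted_rpes(LOTTERIES)
for (name, prize), rpe in sorted(rpes.items(), key=lambda item: item[1]):
    print(f"{name:>18} | {prize} pellet(s) | predicted RPE = {rpe:+.2f}")
```

Under these assumptions the 75% lottery has an expected outcome of 3.25 pellets and the 25% lottery 1.75 pellets, so the four probabilistic lottery-prize combinations give predicted RPEs of -2.25, -0.75, +0.75, and +2.25 pellets, while both deterministic outcomes give 0. The positive and negative values are mirror images, which is what makes the task suitable for comparing the scale of positive and negative RPE encoding.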
Figure 4.
Epoch analysis for all 118 possible 0.5, 1.1, and 1.9 s windows between 0.1 and 5 s after reward onset. a, Colors indicate the number of lottery/electrode combinations for which the dopamine signal to four pellets was greater than the dopamine signal for one pellet for each time window. Counts >9 are consistent with Axiom 1. b, Colors indicate the number of prize/electrode combinations for which the dopamine signal to 25% lottery outcomes was greater than the dopamine signal to 75% lottery outcomes. Counts >9 are consistent with Axiom 2. Dashed lines indicate the set of time windows for which the corrected conjunction p-value for t tests of Axioms 1 and 2 is <0.05. The solid line indicates the time window analyzed in Figure 3.
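The count of 118 windows in the Figure 4 caption can be reproduced with a short enumeration. This is a sketch under stated assumptions, not the authors' procedure: it assumes the signal is sampled every 0.1 s from 0.1 to 5 s after reward onset (the caption does not state the sample spacing) and that a window of a given duration spans that many consecutive samples. STEP_S and count_windows are hypothetical names.

```python
# Hypothetical sketch of the window enumeration behind the Figure 4 caption.
# Assumption: samples are spaced 0.1 s apart, from 0.1 s to 5.0 s after reward
# onset; the caption itself does not give the sample spacing.
STEP_S = 0.1


def count_windows(n_samples, window_lengths):
    """Number of contiguous windows of each length that fit in n_samples."""
    return {length: n_samples - length + 1 for length in window_lengths}


# Samples at 0.1, 0.2, ..., 5.0 s.
n_samples = round((5.0 - 0.1) / STEP_S) + 1

# Window durations from the caption, converted to a number of samples.
window_samples = [round(duration / STEP_S) for duration in (0.5, 1.1, 1.9)]

counts = count_windows(n_samples, window_samples)
total = sum(counts.values())

for length, count in counts.items():
    print(f"{length * STEP_S:.1f} s window ({length} samples): {count} positions")
print(f"total windows: {total}")
```

With 50 samples, the 0.5, 1.1, and 1.9 s windows fit in 46, 40, and 32 positions respectively, for a total of 118, which matches the number given in the caption.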
