Curr Biol. 2019 Jan 7;29(1):93-103.e3.

doi: 10.1016/j.cub.2018.11.050. Epub 2018 Dec 20.

Ventral Tegmental Dopamine Neurons Participate in Reward Identity Predictions

Ronald Keiflin¹, Heather J Pribut², Nisha B Shah², Patricia H Janak³

Affiliations

¹ Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD 21218, USA. Electronic address: rkeiflin@ucsb.edu.
² Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD 21218, USA.
³ Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD 21218, USA; The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, MD 21205, USA; Kavli Neuroscience Discovery Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA. Electronic address: patricia.janak@jhu.edu.

PMID: 30581025
PMCID: PMC6324975
DOI: 10.1016/j.cub.2018.11.050

Ronald Keiflin et al. Curr Biol. 2019.

Abstract

Dopamine (DA) neurons in the ventral tegmental area (VTA) and substantia nigra (SNc) encode reward prediction errors (RPEs) and are proposed to mediate error-driven learning. However, the learning strategy engaged by DA-RPEs remains controversial. RPEs might imbue predictive cues with pure value, independently of representations of their associated outcome. Alternatively, RPEs might promote learning about the sensory features (the identity) of the rewarding outcome. Here, we show that, although both VTA and SNc DA neuron activation reinforces instrumental responding, only VTA DA neuron activation during consumption of expected sucrose reward restores error-driven learning and promotes formation of a new cue→sucrose association. Critically, expression of VTA DA-dependent Pavlovian associations is abolished following sucrose devaluation, a signature of identity-based learning. These findings reveal that activation of VTA- or SNc-DA neurons engages largely dissociable learning processes, with VTA-DA neurons capable of participating in outcome-specific predictive learning, whereas the role of SNc-DA neurons appears limited to reinforcement of instrumental responses.

Keywords: blocking; conditioning; dopamine; learning; model-based; model-free; optogenetics; reward-prediction error; substantia nigra; ventral tegmental area.

Conflict of interest statement

DECLARATION OF INTERESTS

The authors declare no competing interests.

Figures

**Figure 1. Behavioral task and histology.**
**(A)** Three groups of rats were trained in the blocking/unblocking task. During the *Individual Cue* phase, two visual cues (A and B) were paired with sucrose reward. In the *Compound Cue* phase, two new trial types were introduced in which a visual cue was presented simultaneously with an auditory cue (X or Y), resulting in two compound stimuli (AX and BY). The absence of RPE during compound AX is predicted to block learning about cue X. During compound BY, an RPE was produced by increasing reward magnitude (Reward Upshift group) or by photostimulating DA neurons during sucrose consumption (VTA-DA Stim. and SNc-DA Stim. groups). A 1-day probe test assessed the associative strength acquired by each individual cue. **(B)** Reconstruction of ChR2-YFP expression and fiber placement in VTA (left) and SNc (right). Light and dark shading indicate maximal and minimal spread of ChR2-YFP, respectively. Square symbols mark ventral extremity of fiber implants. **(C)** Representative ChR2-YFP expression in VTA (left) or SNc (right). **(D)** Laser power from the fiber tip estimated from [31]. Full laser power = 120 mW/mm² (corresponds to 34 mW at the tip of 300 µm fibers; http://www.optogenetics.org/calc).
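The blocking/unblocking logic of panel (A) can be illustrated with the Rescorla-Wagner rule [5]. The following is a minimal sketch, not the authors' analysis: the learning rate, trial counts, and reward magnitudes are hypothetical, the trial types are run in separate sequences rather than intermixed, and the function name `rescorla_wagner` is introduced here only for illustration.

```python
def rescorla_wagner(phases, alpha=0.2):
    """Simulate error-driven learning over a sequence of training phases.

    phases: list of (n_trials, cues, reward) tuples.
    alpha:  learning rate (hypothetical value, not taken from the paper).
    Returns a dict mapping each cue to its final associative strength.
    """
    strength = {}
    for n_trials, cues, reward in phases:
        for _ in range(n_trials):
            # The prediction is the summed strength of all cues present.
            prediction = sum(strength.get(c, 0.0) for c in cues)
            # Reward prediction error: obtained minus predicted reward.
            rpe = reward - prediction
            # Every cue present on the trial is updated by the same error.
            for c in cues:
                strength[c] = strength.get(c, 0.0) + alpha * rpe
    return strength


# Blocking: A already predicts the (assumed large) reward, so AX yields no RPE.
blocked = rescorla_wagner([(100, ("A",), 3.0), (40, ("A", "X"), 3.0)])

# Unblocking by reward upshift: B predicts an assumed small reward,
# and the compound BY is followed by a larger one, producing a positive RPE.
unblocked = rescorla_wagner([(100, ("B",), 1.0), (40, ("B", "Y"), 3.0)])

print("X (blocked):  ", round(blocked["X"], 3))
print("Y (unblocked):", round(unblocked["Y"], 3))
```

Under these assumptions, cue X acquires almost no associative strength because the reward is fully predicted by A, whereas cue Y acquires substantial strength because the upshift leaves an error to be learned from.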

**Figure 2. Performance during Individual Cue and Compound Cue training.**
**(A-C)** Time spent in reward port during cue presentation over 10 days of Individual Cue conditioning and 4 days of Compound Cue conditioning for Reward Upshift **(A)**, VTA-DA stimulation **(B)**, and SNc-DA stimulation **(C)** groups. Values include only the first 9 s after cue onset and prior to sucrose delivery to avoid contamination with the consumption period. Insets depict average performance over 4 days of Compound Cue conditioning. For all groups, introduction of the auditory stimulus increased performance (A *vs.* AX, and B *vs.* BY, all Ps<0.001, Bonferroni-corrected paired t-tests), but there was no difference in responding between the compound cues (AX *vs.* BY, Ps>0.967, Bonferroni-corrected paired t-tests). **(D-F)** Probability of presence in port throughout cue presentation during last 4 days of Individual Cue (upper graphs) and 4 days of Compound Cue conditioning (lower graphs), for Reward Upshift **(D)**, VTA-DA stimulation **(E)**, and SNc-DA stimulation **(F)** groups. Note that photostimulation during compound cue BY did not disrupt ongoing behavior. See also Figure S1.

**Figure 3. Photoactivation of VTA-DA but not SNc-DA neurons mimics endogenous RPEs and unblocks learning.**
Conditioned responding was measured by time spent in the reward port during cue presentation. **(A-C)** Whole session performance in Reward Upshift **(A)**, VTA-DA stimulation **(B)**, and SNc-DA stimulation **(C)** groups. Scatterplot insets show individual data distributions for responding to A and B (top insets) and X and Y (bottom insets). Histograms along the diagonal are frequency distributions (subject counts) for the difference scores (A - B, or X - Y); off-centered distributions reveal higher responding to one of the cues. **(D-F)** Trial-by-trial test performance after Reward Upshift **(D)**, VTA-DA stimulation **(E)**, and SNc-DA stimulation **(F)**. A 3-way mixed ANOVA (Group x Cue x Trial) analyzed the evolution of responding over the session and found an interaction between all factors (F(30,855)=2.603, P<0.001, after Greenhouse-Geisser correction). **(G-I)** Second-by-second tracking of presence in port during first presentation of each cue (A, B: upper graph; X, Y: lower graph) for Reward Upshift **(G)**, VTA-DA stimulation **(H)**, and SNc-DA stimulation **(I)** groups. *P<0.05 (A *vs.* B, or X *vs.* Y; post-hoc Bonferroni-corrected t-test). Error bars = s.e.m. See also Figures S1-S3.

**Figure 4. Photoactivation of VTA-DA or SNc-DA neurons serves as an equally potent reinforcer of ICSS behavior.**
**(A)** Rats could respond on one of two nosepokes to obtain optical stimulation of VTA- or SNc-DA neurons. **(B)** Responses at active and inactive nosepokes during daily 1-h sessions. **(C)** Cumulative active nosepoke responses during the last ICSS session. *P<0.05, Active vs. Inactive Nosepoke; #P<0.05, Session 1 vs. Session 2 (active nosepoke). Error bar and error bands = s.e.m.

**Figure 5. Devaluation of the sucrose outcome abolishes conditioned responding to the unblocked cue Y in Reward Upshift and VTA-DA groups.**
Learning about target cue Y was unblocked by reward upshift (top graphs) or activation of VTA-DA neurons (bottom graphs). Following unblocking, sucrose was devalued for half of the subjects in Reward Upshift and VTA-DA groups by pairing sucrose consumption with LiCl (Devalued condition). The remaining subjects were exposed to sucrose or LiCl-induced illness on alternate days, preserving the value of sucrose (Valued condition). Conditioned responding to Y (unblocked cue) and A (cue paired with large reward) was assessed at Test. **(A, B)** Time spent in reward port during cue presentation in Reward Upshift **(A)** and VTA-DA **(B)** groups. Sucrose devaluation reduced responding to Y in both groups. Insets represent intertrial interval (ITI) responding outside cue presentation. **(C, D)** Trial-by-trial performance in Reward Upshift **(C)** and VTA-DA stimulation **(D)** groups. 3-way ANOVAs (Cue x Devaluation x Trial) found an interaction between these factors for VTA-DA (F(2,20)=3.901, P=0.037) but not Reward Upshift (F(2,21)=1.276, P=0.300) subjects. **(E, F)** Second-by-second tracking of presence in port during first presentation of each cue. *P<0.05 (Valued vs. Devalued; Bonferroni-corrected t-test). Error bar and error bands = s.e.m. See also Figures S4-S5.
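Why devaluation sensitivity is treated as a signature of identity-based learning can be seen in a toy contrast between two ways a cue might drive responding. This is a hedged sketch with hypothetical names and values, not a model fitted to the data: a value-caching (model-free) learner stores a number for the cue, while an identity-based (model-based) learner stores which outcome the cue predicts and looks up that outcome's current value at test.

```python
def cached_value_response(cached_value, cue):
    """Model-free sketch: respond according to the value stored with the cue."""
    return cached_value[cue]


def identity_based_response(cue_outcome, outcome_value, cue):
    """Model-based sketch: retrieve the predicted outcome, then its current value."""
    return outcome_value[cue_outcome[cue]]


# Hypothetical state after unblocking: cue Y has been learned about.
cached_value = {"Y": 1.0}            # pure value attached to the cue
cue_outcome = {"Y": "sucrose"}       # cue -> outcome identity association
outcome_value = {"sucrose": 1.0}     # current desirability of the outcome

before_cached = cached_value_response(cached_value, "Y")
before_identity = identity_based_response(cue_outcome, outcome_value, "Y")

# Devaluation (e.g. pairing sucrose with illness) changes only the outcome's
# value; the cue is never presented, so nothing stored with the cue changes.
outcome_value["sucrose"] = 0.0

after_cached = cached_value_response(cached_value, "Y")
after_identity = identity_based_response(cue_outcome, outcome_value, "Y")

print("cached value:   ", before_cached, "->", after_cached)
print("identity-based: ", before_identity, "->", after_identity)
```

In this sketch only the identity-based learner reduces responding to Y after devaluation, which is the pattern the legend reports for the Reward Upshift and VTA-DA groups.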

Cited by

Dynamics of Lateral Habenula-Ventral Tegmental Area Microcircuit on Pain-Related Cognitive Dysfunctions.
Pereira AR, Alemi M, Cerqueira-Nunes M, Monteiro C, Galhardo V, Cardoso-Cruz H. Neurol Int. 2023 Oct 27;15(4):1303-1319. doi: 10.3390/neurolint15040082. PMID: 37987455 Free PMC article. Review.
Distinct temporal difference error signals in dopamine axons in three regions of the striatum in a decision-making task.
Tsutsui-Kimura I, Matsumoto H, Akiti K, Yamada MM, Uchida N, Watabe-Uchida M. Elife. 2020 Dec 21;9:e62390. doi: 10.7554/eLife.62390. PMID: 33345774 Free PMC article.
Neuronal activity in the ventral tegmental area during goal-directed navigation recorded by low-curvature microelectrode arrays.
Xu W, Wang M, Yang G, Mo F, Liu Y, Shan J, Jing L, Li M, Liu J, Lv S, Duan Y, Han M, Xu Z, Song Y, Cai X. Microsyst Nanoeng. 2024 Oct 14;10(1):145. doi: 10.1038/s41378-024-00778-2. PMID: 39396959 Free PMC article.
Dopamine D2 receptor signaling on iMSNs is required for initiation and vigor of learned actions.
Augustin SM, Loewinger GC, O'Neal TJ, Kravitz AV, Lovinger DM. Neuropsychopharmacology. 2020 Nov;45(12):2087-2097. doi: 10.1038/s41386-020-00799-1. Epub 2020 Aug 18. PMID: 32811899 Free PMC article.
Dopamine signals as temporal difference errors: recent advances.
Starkweather CK, Uchida N. Curr Opin Neurobiol. 2021 Apr;67:95-105. doi: 10.1016/j.conb.2020.08.014. Epub 2020 Nov 10. PMID: 33186815 Free PMC article. Review.

References

1. Eshel N, Tian J, Bukwich M, and Uchida N (2016). Dopamine neurons share common response function for reward prediction error. Nat Neurosci 19, 479–486.
2. Schultz W, Dayan P, and Montague PR (1997). A neural substrate of prediction and reward. Science 275, 1593–1599.
3. Waelti P, Dickinson A, and Schultz W (2001). Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48.
4. Glimcher PW (2011). Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci U S A 108 Suppl 3, 15647–15654.
5. Rescorla RA, and Wagner AR (1972). A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In Classical conditioning II: current research and theory, Black AH and Prokasy WF, eds. (New York: Appleton-Century-Crofts), pp. 64–99.

Grants and funding

R01 DA035943/DA/NIDA NIH HHS/United States
