Curr Biol. 2017 Nov 20;27(22):3480-3486.e3.

doi: 10.1016/j.cub.2017.09.049. Epub 2017 Nov 2.

Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features

Chun Yun Chang¹, Matthew Gardner², Maria Gonzalez Di Tillio², Geoffrey Schoenbaum³

Affiliations

¹ National Institute on Drug Abuse Intramural Research Program, Cellular Neurobiology Research Branch, Behavioral Neurophysiology Research Section, 251 Bayview Boulevard, Baltimore, MD 21224, USA. Electronic address: tina.chang@nih.gov.
² National Institute on Drug Abuse Intramural Research Program, Cellular Neurobiology Research Branch, Behavioral Neurophysiology Research Section, 251 Bayview Boulevard, Baltimore, MD 21224, USA.
³ National Institute on Drug Abuse Intramural Research Program, Cellular Neurobiology Research Branch, Behavioral Neurophysiology Research Section, 251 Bayview Boulevard, Baltimore, MD 21224, USA; Department of Anatomy and Neurobiology, University of Maryland, School of Medicine, 655 W. Baltimore Street, Baltimore, MD 21201, USA; Solomon H. Snyder Department of Neuroscience, The Johns Hopkins University, 725 N. Wolfe Street, Baltimore, MD 21287, USA. Electronic address: geoffrey.schoenbaum@nih.gov.

PMID: 29103933
PMCID: PMC5698141
DOI: 10.1016/j.cub.2017.09.049

Chun Yun Chang et al. Curr Biol. 2017.

Abstract

Prediction errors are critical for associative learning [1, 2]. Transient changes in dopamine neuron activity correlate with positive and negative reward prediction errors and can mimic their effects [3-15]. However, although causal studies show that dopamine transients of 1-2 s are sufficient to drive learning about reward, these studies do not address whether they are necessary (but see [11]). Further, the precise nature of this signal is not yet fully established. Although it has been equated with the cached-value error signal proposed to support model-free reinforcement learning, cached-value errors are typically confounded with errors in the prediction of reward features [16]. Here, we used optogenetic and transgenic approaches to prevent transient changes in midbrain dopamine neuron activity during the critical error-signaling period of two unblocking tasks. In one, learning was unblocked by increasing the number of rewards, a manipulation that induces errors in predicting both value and reward features. In another, learning was unblocked by switching from one to another equally valued reward, a manipulation that induces errors only in reward feature prediction. Preventing dopamine neurons in the ventral tegmental area from firing for 5 s beginning before and continuing until after the changes in reward prevented unblocking of learning in both tasks. A similar duration suppression did not induce extinction when delivered during an expected reward, indicating that it did not act independently as a negative prediction error. This result suggests that dopamine transients play a general role in error signaling rather than being restricted to only signaling errors in value.
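The distinction the abstract draws between value errors and reward-feature errors can be illustrated with a toy Rescorla-Wagner simulation. The sketch below is not the authors' analysis: the learning rate, trial counts, the two-element outcome coding (banana, chocolate), and the function names `train` and `cue_a_after_compound` are all assumptions made for illustration, and the task is reduced to one pretrained cue (V) plus one added cue (A). It compares a learner whose error is computed on scalar value with one whose error is computed on a vector of reward features.

```python
import numpy as np

V, A = 0, 1  # cue indices: pretrained cue V, added cue A (hypothetical labels)

def train(weights, cues, outcome, n_trials=200, alpha=0.1):
    """Apply the Rescorla-Wagner delta rule and return an updated copy.

    weights: array of shape (n_cues, n_features)
    cues:    indices of the cues present on every trial
    outcome: reward delivered, as a feature vector (assumed coding)
    """
    w = weights.copy()
    outcome = np.asarray(outcome, dtype=float)
    for _ in range(n_trials):
        prediction = w[cues].sum(axis=0)  # summed prediction of present cues
        error = outcome - prediction      # prediction error, one per feature
        w[cues] += alpha * error          # every present cue shares the error
    return w

def cue_a_after_compound(pretrain_outcome, compound_outcome):
    """Pretrain V alone, then train the V+A compound; return A's weights."""
    w = np.zeros((2, len(pretrain_outcome)))
    w = train(w, [V], pretrain_outcome)
    w = train(w, [V, A], compound_outcome)
    return w[A]

# Outcomes coded as [banana, chocolate] pellets, each pellet worth 1 (assumed).
designs = {
    "blocked":  ([1, 0], [1, 0]),  # same reward: no error of either kind
    "number":   ([1, 0], [2, 0]),  # more reward: value and feature error
    "identity": ([1, 0], [0, 1]),  # equally valued swap: feature error only
}

results = {}
for name, (pre, comp) in designs.items():
    value_only = cue_a_after_compound([sum(pre)], [sum(comp)])  # scalar value
    features = cue_a_after_compound(pre, comp)                  # feature vector
    results[name] = (value_only, features)
    print(f"{name:9s} value-only A = {np.round(value_only, 3)}, "
          f"feature A = {np.round(features, 3)}")
```

Under these assumptions, the value-only learner acquires nothing about cue A in the identity design, because the summed value of the reward is unchanged, whereas the feature learner does acquire an association. This is why an effect of dopamine suppression on identity unblocking is informative about what the error signal carries.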

Keywords: associative learning; blocking; dopamine; rat; reward prediction error.

Published by Elsevier Ltd.

Figures

**Figure 1. Histological verification, task designs, and pellet preference test**
A) Fiber implants were localized in the vicinity of NpHR expression in VTA. The light orange shading represents the maximal spread of expression at each level, whereas the dark orange shading represents the minimal spread. B) Expression of NpHR showed a high degree of colocalization (~90%) with TH in VTA neurons. Green represents NpHR-eYFP; red represents TH. Scale bar is 500 μm. C) Design of the number (top) and identity (bottom) unblocking tasks. All rats were trained in both tasks; the order of training was counterbalanced. D) Preference test comparing consumption of the banana and chocolate pellets used in the identity unblocking task. During the test, the rats were given access to both banana and chocolate pellets (200 pellets each). The number of remaining pellets was assessed every 2.5 min, 5 min, and 10 min as the test progressed. There was no discernible difference in consumption rate between the two flavors over the course of the 60 min test (p > 0.32).

**Figure 2. Optogenetic blockade of dopamine transients prevents learning induced by changes in reward number**
Design is illustrated in Figure 1C. Conditioned responding is shown to VB and VUB during conditioning and reconditioning (left), to VB/AB and VUB/AUB during compound training (middle), and to AB and AUB during the probe test (right). Conditioned responding is represented as the percentage of time the rats spent in the food cup during cue presentation. Top panels for compound training and the probe test show data from the experimental run (Exp), when neurons were suppressed during delivery of the second pellet, and bottom panels show data from the ITI run (ITI), when neurons were suppressed during the intertrial interval. Insets show the percentage of time rats spent in the food cup during the reward period after termination of the cues. (See also Figure S1.)

**Figure 3. Optogenetic blockade of dopamine transients prevents learning induced by changes in reward flavor**
Design is illustrated in Figure 1C. Conditioned responding is shown to VB and VUB during conditioning and reconditioning (left), to VB/AB and VUB/AUB during compound training (middle), and to AB and AUB during the probe test (right). Conditioned responding is represented as the percentage of time the rats spent in the food cup during cue presentation. Top panels for compound training and the probe test show data from the experimental run (Exp), when neurons were suppressed during delivery of the second pellet, and bottom panels show data from the ITI run (ITI), when neurons were suppressed during the intertrial interval. Insets show the percentage of time rats spent in the food cup during the reward period after termination of the cues.

Comment in

Error-Driven Learning: Dopamine Signals More Than Value-Based Errors.
Keiflin R, Janak PH. Curr Biol. 2017 Dec 18;27(24):R1321-R1324. doi: 10.1016/j.cub.2017.10.043. PMID: 29257968

References

1. Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical Conditioning II: Current Research and Theory. Appleton-Century-Crofts; New York: 1972. pp. 64–99.
2. Sutton RS. Learning to predict by the method of temporal difference. Machine Learning. 1988;3:9–44.
3. Mirenowicz J, Schultz W. Importance of unpredictability for reward responses in primate dopamine neurons. Journal of Neurophysiology. 1994;72:1024–1027.
4. Roesch MR, Calu DJ, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nature Neuroscience. 2007;10:1615–1624.
5. Waelti P, Dickinson A, Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001;412:43–48.
