Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features
- PMID: 29103933
- PMCID: PMC5698141
- DOI: 10.1016/j.cub.2017.09.049
Abstract
Prediction errors are critical for associative learning [1, 2]. Transient changes in dopamine neuron activity correlate with positive and negative reward prediction errors and can mimic their effects [3-15]. However, although causal studies show that dopamine transients of 1-2 s are sufficient to drive learning about reward, these studies do not address whether such transients are necessary (but see [11]). Further, the precise nature of this signal is not yet fully established. Although it has been equated with the cached-value error signal proposed to support model-free reinforcement learning, cached-value errors are typically confounded with errors in the prediction of reward features [16]. Here, we used optogenetic and transgenic approaches to prevent transient changes in midbrain dopamine neuron activity during the critical error-signaling period of two unblocking tasks. In one, learning was unblocked by increasing the number of rewards, a manipulation that induces errors in predicting both value and reward features. In another, learning was unblocked by switching from one to another equally valued reward, a manipulation that induces errors only in reward feature prediction. Preventing dopamine neurons in the ventral tegmental area from firing for 5 s, beginning before and continuing until after the changes in reward, prevented unblocking of learning in both tasks. Suppression of similar duration did not induce extinction when delivered during an expected reward, indicating that it did not act independently as a negative prediction error. This result suggests that dopamine transients play a general role in error signaling rather than being restricted to signaling errors in value.
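The logic of blocking and value-based unblocking can be illustrated with a toy simulation of the Rescorla-Wagner rule [1]. This is a minimal sketch and not the authors' analysis: the function name `rescorla_wagner`, the learning rate, the trial counts, and the `error_suppressed` flag are all assumptions made for illustration, and treating dopamine suppression as a zeroed error term is itself a simplifying assumption. Because the model tracks a single scalar value, it can represent the reward-number manipulation but not the switch between equally valued rewards.

```python
def rescorla_wagner(phases, alpha=0.2):
    """Toy Rescorla-Wagner simulation (illustrative assumptions throughout).

    phases: list of (cues, reward, n_trials, error_suppressed) tuples.
    Returns a dict mapping each cue to its final associative strength.
    """
    V = {}
    for cues, reward, n_trials, error_suppressed in phases:
        for _ in range(n_trials):
            # Summed prediction from all cues present on this trial.
            prediction = sum(V.get(c, 0.0) for c in cues)
            # Hypothetical stand-in for dopamine suppression: zero the error.
            error = 0.0 if error_suppressed else reward - prediction
            # Every cue present shares the same error term.
            for c in cues:
                V[c] = V.get(c, 0.0) + alpha * error
    return V


# Cue A is trained alone, then A and a new cue X are presented together.
pretrain = (("A",), 1.0, 100, False)

# Blocking: reward unchanged in compound, so X gains almost nothing.
blocking = rescorla_wagner([pretrain, (("A", "X"), 1.0, 20, False)])

# Unblocking: reward doubled in compound, so X acquires strength.
unblocking = rescorla_wagner([pretrain, (("A", "X"), 2.0, 20, False)])

# Same reward increase, but with the error term suppressed.
suppressed = rescorla_wagner([pretrain, (("A", "X"), 2.0, 20, True)])

for name, result in [("blocking", blocking),
                     ("unblocking", unblocking),
                     ("suppressed", suppressed)]:
    print(name, round(result["X"], 3))
```

In this sketch, X ends near zero in the blocking condition, gains substantial strength when the reward is increased, and stays at zero when the error term is suppressed, which mirrors the qualitative pattern described for the reward-number task.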
Keywords: associative learning; blocking; dopamine; rat; reward prediction error.
Published by Elsevier Ltd.
Comment in
- Error-Driven Learning: Dopamine Signals More Than Value-Based Errors. Curr Biol. 2017 Dec 18;27(24):R1321-R1324. doi: 10.1016/j.cub.2017.10.043. PMID: 29257968
References
- Rescorla RA, Wagner AR. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical Conditioning II: Current Research and Theory. Appleton-Century-Crofts; New York: 1972. pp. 64–99.
- Sutton RS. Learning to predict by the method of temporal difference. Machine Learning. 1988;3:9–44.
- Mirenowicz J, Schultz W. Importance of unpredictability for reward responses in primate dopamine neurons. Journal of Neurophysiology. 1994;72:1024–1027.
- Waelti P, Dickinson A, Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001;412:43–48.
