A silent eligibility trace enables dopamine-dependent synaptic plasticity for reinforcement learning in the mouse striatum
- PMID: 29603470
- PMCID: PMC6585681
- DOI: 10.1111/ejn.13921
Abstract
Dopamine-dependent synaptic plasticity is a candidate mechanism for reinforcement learning. A silent eligibility trace - initiated by synaptic activity and transformed into synaptic strengthening by later action of dopamine - has been hypothesized to explain the retroactive effect of dopamine in reinforcing past behaviour. We tested this hypothesis by measuring time-dependent modulation of synaptic plasticity by dopamine in adult mouse striatum, using whole-cell recordings. Presynaptic activity followed by postsynaptic action potentials (pre-post) caused spike-timing-dependent long-term depression in D1-expressing neurons, but not in D2 neurons, and not if postsynaptic activity preceded presynaptic activity. Subsequent experiments focused on D1 neurons. Applying a dopamine D1 receptor agonist during induction of pre-post plasticity caused long-term potentiation. This long-term potentiation was hidden by long-term depression occurring concurrently and was unmasked when long-term depression was blocked by an L-type calcium channel antagonist. Long-term potentiation was blocked by a Ca2+-permeable AMPA receptor antagonist but not by an NMDA antagonist or an L-type calcium channel antagonist. Pre-post stimulation caused transient elevation of rectification - a marker for expression of Ca2+-permeable AMPA receptors - for 2-4 s after stimulation. To test for an eligibility trace, dopamine was uncaged at specific time points before and after pre- and postsynaptic conjunction of activity. Dopamine caused potentiation selectively at synapses that were active 2 s before dopamine release, but not at earlier or later times. Our results provide direct evidence for a silent eligibility trace in the synapses of striatal neurons. This dopamine-timing-dependent plasticity may play a central role in reinforcement learning.
Keywords: dopamine; learning; reinforcement; temporal difference.
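The timing dependence described in the abstract can be illustrated with a toy model. The sketch below is not the authors' model: the gamma-shaped trace, the 2 s peak time, the shape parameter and the learning rate are all assumptions chosen only so that a trace started by pre-post pairing is largest when dopamine arrives about 2 s later and is absent when dopamine arrives before pairing.

```python
import math

# All constants are illustrative assumptions, not values fitted to the paper.
PEAK_S = 2.0   # assumed delay (s) at which the trace is largest
SHAPE = 4      # assumed sharpness of the trace (larger = narrower window)


def eligibility(delay_s: float) -> float:
    """Toy silent eligibility trace as a function of time since pre-post pairing.

    Gamma-shaped curve normalised to 1.0 at delay_s == PEAK_S.
    Returns 0.0 when dopamine arrives at or before the pairing.
    """
    if delay_s <= 0.0:
        return 0.0
    x = delay_s / PEAK_S
    return (x ** SHAPE) * math.exp(SHAPE * (1.0 - x))


def weight_change(pairing_time_s: float, dopamine_time_s: float,
                  learning_rate: float = 0.1) -> float:
    """Potentiation produced by one dopamine pulse, gated by the trace."""
    return learning_rate * eligibility(dopamine_time_s - pairing_time_s)


if __name__ == "__main__":
    # Dopamine delivered at several times relative to pairing at t = 0 s.
    for dopamine_time in (-1.0, 0.5, 2.0, 6.0):
        dw = weight_change(0.0, dopamine_time)
        print(f"dopamine at {dopamine_time:+.1f} s -> weight change {dw:.4f}")
```

In this sketch the synapse itself does not change until dopamine arrives, which is the sense in which the trace is "silent"; the weight update is simply the product of the dopamine pulse and the trace value at that moment.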
© 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Conflict of interest statement
The authors declare no conflict of interest.
