Dynamical model of salience gated working memory, action selection and reinforcement based on basal ganglia and dopamine feedback
- PMID: 18280108
- DOI: 10.1016/j.neunet.2007.12.040
Erratum in
- Neural Netw. 2008 May;21(4):698
Abstract
A simple working memory model based on recurrent network activation is proposed, and its application to the selection and reinforcement of an action is demonstrated as a solution to the temporal credit assignment problem. Reactivation of recent salient cue states is generated and maintained as a type of salience-gated, recurrently active working memory, while lower-salience distractors are ignored. Cue reactivation during the action selection period allows the cue to select an action, while its reactivation at the reward period allows reinforcement of the action selected by the reactivated state, which is necessarily the action that led to the reward being found. Down-gating of the external input during reactivation and maintenance prevents interference. A double winner-take-all system, which selects only one cue and only one action, allows the cue-action allocation to be targeted for modification. This targeting works both to reinforce a correct cue-action allocation and to punish the allocation when cue-action allocations change. Here we suggest a firing-rate neural network implementation of this system based on basal ganglia anatomy, with input from a cortical association layer where reactivations are generated by signals from the thalamus. Striatal medium spiny neurons represent actions. Auto-catalytic feedback from a dopamine reward signal modulates three-way Hebbian long-term potentiation and depression at the cortical-striatal synapses, which represent the cue-action associations. The model is illustrated by numerical simulations of a simple example: associating a cue signal with a correct action to obtain reward after a delay period, as is typical of primate cue-reward tasks. Through learning, the model shows a transition from an exploratory phase, where actions are generated randomly, to a stable directed phase, where the animal always chooses the correct action for each experienced state. When cue-action allocations change, we show that the model detects the change: the incorrect cue-action allocations are punished and the correct ones are discovered.
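The learning loop described in the abstract can be illustrated with a deliberately minimal sketch. It is not the authors' firing-rate model: the names `gate`, `select_action` and `run_trials`, the salience threshold, the noise level, the learning rate and the signed dopamine term `reward - 0.5` are all assumptions made for illustration. It shows a salience gate that holds the cue across a distractor, a winner-take-all action choice, and a three-factor (cue x action x dopamine) weight update that reinforces rewarded allocations and punishes them after a reversal.

```python
import random

def gate(memory, stimulus, salience, threshold=0.5):
    """Salience gate (assumed form): only a salient stimulus overwrites the held cue."""
    return stimulus if salience >= threshold else memory

def select_action(weights, cue, rng, noise=0.5):
    """Winner-take-all: the action with the largest noisy drive is selected."""
    drives = [w + rng.uniform(0.0, noise) for w in weights[cue]]
    return max(range(len(drives)), key=drives.__getitem__)

def run_trials(correct_action, n_trials, weights, rng, lr=0.2):
    """Run cue -> delay -> action -> reward trials; return the reward of each trial."""
    n_cues = len(weights)
    rewards = []
    for t in range(n_trials):
        cue = t % n_cues                                  # cues alternate across trials
        memory = gate(None, cue, salience=1.0)            # salient cue is stored
        distractor = (cue + 1) % n_cues
        memory = gate(memory, distractor, salience=0.1)   # weak distractor is ignored
        action = select_action(weights, memory, rng)
        reward = 1.0 if action == correct_action[memory] else 0.0
        dopamine = reward - 0.5                           # assumed signed feedback signal
        # Three-factor update: held cue (1) x selected action (1) x dopamine,
        # applied only to the single winning cue-action pair.
        w = weights[memory][action] + lr * dopamine
        weights[memory][action] = min(1.0, max(0.0, w))
        rewards.append(reward)
    return rewards

rng = random.Random(0)
weights = [[0.0, 0.0], [0.0, 0.0]]                    # weights[cue][action]
phase1 = run_trials([0, 1], 200, weights, rng)        # initial cue-action allocation
phase2 = run_trials([1, 0], 200, weights, rng)        # allocation reversed
print(sum(phase1[-50:]) / 50, sum(phase2[-50:]) / 50)
```

Because only the winning cue-action weight is changed on each trial, an unrewarded choice lowers exactly the allocation that produced it, which is what lets the sketch recover after the allocation is reversed.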
Cited by
- Imagery in the entropic associative memory. Sci Rep. 2023 Jun 12;13(1):9553. doi: 10.1038/s41598-023-36761-6. PMID: 37308676. Free PMC article.
- From Focused Thought to Reveries: A Memory System for a Conscious Robot. Front Robot AI. 2018 Apr 4;5:29. doi: 10.3389/frobt.2018.00029. eCollection 2018. PMID: 33500916. Free PMC article.
- Basal ganglia neurons dynamically facilitate exploration during associative learning. J Neurosci. 2011 Mar 30;31(13):4878-85. doi: 10.1523/JNEUROSCI.3658-10.2011. PMID: 21451026. Free PMC article.
- The modeling and simulation of visuospatial working memory. Cogn Neurodyn. 2010 Dec;4(4):359-66. doi: 10.1007/s11571-010-9129-6. Epub 2010 Aug 25. PMID: 22132045. Free PMC article.
- Striatal activity during intentional switching depends on pattern stability. J Neurosci. 2010 Mar 3;30(9):3167-74. doi: 10.1523/JNEUROSCI.2673-09.2010. PMID: 20203176. Free PMC article.