Actor-critic models of the basal ganglia: new anatomical and computational perspectives
- PMID: 12371510
- DOI: 10.1016/s0893-6080(02)00047-3
Abstract
A large number of computational models of information processing in the basal ganglia have been developed in recent years. Prominent among these are actor-critic models of basal ganglia functioning, which build on the strong resemblance between dopamine neuron activity and the temporal difference prediction error signal in the critic, and between dopamine-dependent long-term synaptic plasticity in the striatum and learning guided by a prediction error signal in the actor. We selectively review several actor-critic models of the basal ganglia with an emphasis on two important aspects: the way in which models of the critic reproduce the temporal dynamics of dopamine firing, and the extent to which models of the actor take into account known basal ganglia anatomy and physiology. To complement the efforts to relate basal ganglia mechanisms to reinforcement learning (RL), we introduce an alternative approach to modeling a critic network, which uses Evolutionary Computation techniques to 'evolve' an optimal RL mechanism, and relate the evolved mechanism to the basic model of the critic. We close our discussion of models of the critic with a critical assessment of the anatomical plausibility of implementations of a critic in basal ganglia circuitry, and conclude that such implementations build on assumptions that are inconsistent with the known anatomy of the basal ganglia. We then return to the actor component of the actor-critic model, which is usually modeled at the striatal level with very little detail. We describe an alternative model of the basal ganglia which takes into account several important, and previously neglected, anatomical and physiological characteristics of basal ganglia-thalamocortical connectivity and suggests that the basal ganglia perform reinforcement-biased dimensionality reduction of cortical inputs. We further suggest that since such selective encoding may bias the representation at the level of the frontal cortex towards the selection of rewarded plans and actions, the reinforcement-driven dimensionality reduction framework may serve as a basis for basal ganglia actor models. We conclude with a short discussion of the dual role of the dopamine signal in RL and in behavioral switching.
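To make the actor-critic correspondence in the abstract concrete, the following is a minimal textbook-style sketch of a tabular actor-critic learner, not any of the models reviewed in the paper. The critic computes a temporal difference prediction error (the quantity the abstract compares to dopamine neuron activity), and the same error signal updates both the critic's value estimates and the actor's action preferences. The chain task, the function names (`td_error`, `softmax`, `run_actor_critic`), and all parameter values are assumptions chosen purely for illustration.

```python
import math
import random


def td_error(reward, value_s, value_next, gamma=0.9):
    """Temporal difference prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(n_states=5, episodes=500, alpha_critic=0.1,
                     alpha_actor=0.1, gamma=0.9, seed=0):
    """Hypothetical chain task: start at state 0, reward 1 on reaching the end."""
    rng = random.Random(seed)
    # Critic: one value per state, plus a terminal state whose value stays 0.
    values = [0.0] * (n_states + 1)
    # Actor: preferences for two actions per state (0 = left, 1 = right).
    prefs = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            probs = softmax(prefs[s])
            a = 1 if rng.random() < probs[1] else 0
            s_next = s + 1 if a == 1 else max(s - 1, 0)
            reward = 1.0 if s_next == n_states else 0.0
            # The critic's prediction error drives both updates.
            delta = td_error(reward, values[s], values[s_next], gamma)
            values[s] += alpha_critic * delta   # critic update
            prefs[s][a] += alpha_actor * delta  # actor update
            s = s_next
            if s == n_states:
                break
    return values, prefs


if __name__ == "__main__":
    values, prefs = run_actor_critic()
    print("state values:", [round(v, 2) for v in values])
    print("P(right) per state:", [round(softmax(p)[1], 2) for p in prefs])
```

In this sketch a positive prediction error strengthens the action just taken and a negative one weakens it, which is the sense in which the abstract describes learning in the actor as guided by a prediction error signal.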
Similar articles
- [Morphological Re-evaluation of the Basal Ganglia Network]. Brain Nerve. 2016 Jul;68(7):861-4. doi: 10.11477/mf.1416200519. PMID: 27395470 Review. Japanese.
- A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task. Neuroscience. 1999;91(3):871-90. doi: 10.1016/s0306-4522(98)00697-6. PMID: 10391468
- The many worlds hypothesis of dopamine prediction error: implications of a parallel circuit architecture in the basal ganglia. Curr Opin Neurobiol. 2017 Oct;46:241-247. doi: 10.1016/j.conb.2017.08.015. Epub 2017 Oct 3. PMID: 28985550 Review.
- Integration of reinforcement learning and optimal decision-making theories of the basal ganglia. Neural Comput. 2011 Apr;23(4):817-51. doi: 10.1162/NECO_a_00103. Epub 2011 Jan 11. PMID: 21222528
- Information processing, dimensionality reduction and reinforcement learning in the basal ganglia. Prog Neurobiol. 2003 Dec;71(6):439-73. doi: 10.1016/j.pneurobio.2003.12.001. PMID: 15013228 Review.
Cited by
- Learning and generalization from reward and punishment in opioid addiction. Behav Brain Res. 2017 Jan 15;317:122-131. doi: 10.1016/j.bbr.2016.09.033. Epub 2016 Sep 15. PMID: 27641323 Free PMC article.
- Selective consolidation of learning and memory via recall-gated plasticity. Elife. 2024 Jul 18;12:RP90793. doi: 10.7554/eLife.90793. PMID: 39023518 Free PMC article.
- Combining backpropagation with Equilibrium Propagation to improve an Actor-Critic reinforcement learning framework. Front Comput Neurosci. 2022 Aug 23;16:980613. doi: 10.3389/fncom.2022.980613. eCollection 2022. PMID: 36082305 Free PMC article.
- Distinct neural representation in the dorsolateral, dorsomedial, and ventral parts of the striatum during fixed- and free-choice tasks. J Neurosci. 2015 Feb 25;35(8):3499-514. doi: 10.1523/JNEUROSCI.1962-14.2015. PMID: 25716849 Free PMC article.
- Acetylcholine-based entropy in response selection: a model of how striatal interneurons modulate exploration, exploitation, and response variability in decision-making. Front Neurosci. 2012 Feb 6;6:18. doi: 10.3389/fnins.2012.00018. eCollection 2012. PMID: 22347164 Free PMC article.