Reward-dependent learning in neuronal networks for planning and decision making
- PMID: 11105649
- DOI: 10.1016/S0079-6123(00)26016-0
Abstract
Neuronal network models have been proposed for the organization of evaluation and decision processes in prefrontal circuitry and their putative neuronal and molecular bases. The models all include an implementation and simulation of an elementary reward mechanism. Their central hypothesis is that tentative rules of behavior, which are coded by clusters of active neurons in prefrontal cortex, are selected or rejected based on an evaluation by this reward signal, which may be conveyed, for instance, by the mesencephalic dopaminergic neurons with which the prefrontal cortex is densely interconnected. At the molecular level, the reward signal is postulated to be a neurotransmitter such as dopamine, which exerts a global modulatory action on prefrontal synaptic efficacies, either via volume transmission or via targeted synaptic triads. Negative reinforcement has the effect of destabilizing the currently active rule-coding clusters; subsequently, spontaneous activity varies again from one cluster to another, giving the organism the chance to discover and learn a new rule. Thus, reward signals function as effective selection signals that either maintain or suppress currently active prefrontal representations as a function of their current adequacy. Simulations of this variation-selection have successfully accounted for the main features of several major tasks that depend on prefrontal cortex integrity, such as the delayed-response test, the Wisconsin card sorting test, the Tower of London test and the Stroop test. For the more complex tasks, we have found it necessary to supplement the external reward input with a second mechanism that supplies an internal reward; it consists of an auto-evaluation loop which short-circuits the reward input from the exterior. This allows for an internal evaluation of covert motor intentions without actualizing them as behaviors, by simply testing them covertly by comparison with memorized former experiences. 
This element of architecture gives access to enhanced rates of learning via an elementary process of internal or covert mental simulation. We have recently applied these ideas to a new model, developed with M. Kerszberg, which hypothesizes that prefrontal cortex and its reward-related connections contribute crucially to conscious effortful tasks. This model distinguishes two main computational spaces within the human brain: a unique global workspace composed of distributed and heavily interconnected neurons with long-range axons, and a set of specialized and modular perceptual, motor, memory, evaluative and attentional processors. We postulate that workspace neurons are mobilized in effortful tasks for which the specialized processors do not suffice; they selectively mobilize or suppress, through descending connections, the contribution of specific processor neurons. In the course of task performance, workspace neurons become spontaneously co-activated, forming discrete though variable spatio-temporal patterns subject to modulation by vigilance signals and to selection by reward signals. A computer simulation of the Stroop task shows workspace activation to increase during acquisition of a novel task, during effortful execution, and after errors. This model makes predictions concerning the spatio-temporal activation patterns during brain imaging of cognitive tasks, particularly concerning the conditions of activation of dorsolateral prefrontal cortex and anterior cingulate, their relation to reward mechanisms, and their specific reaction during error processing.
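The variation-selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' network model: it replaces rule-coding neuronal clusters with integer rule indices, and reward-modulated synaptic efficacies with a simple keep-or-switch step. The names `simulate`, `count_errors`, `use_memory` and `failed` are hypothetical. A positive reward keeps the active rule, a negative reward destabilizes it so that a different rule is drawn at random, and the optional memory stands in, very loosely, for the auto-evaluation loop by covertly rejecting rules already remembered as unrewarded.

```python
import random


def simulate(n_rules, correct_rule, n_trials, seed=0, use_memory=False):
    """Toy variation-selection over candidate rules (needs n_rules >= 2).

    Assumption: each integer stands for one rule-coding cluster. Returns
    a list of (active_rule, reward) pairs, one per trial.
    """
    rng = random.Random(seed)
    active = rng.randrange(n_rules)   # spontaneous initial activity
    failed = set()                    # memorized unrewarded rules
    history = []
    for _ in range(n_trials):
        # External reward: +1 if the active rule fits the task, else -1.
        reward = 1 if active == correct_rule else -1
        history.append((active, reward))
        if reward > 0:
            continue                  # positive reward keeps the rule active
        # Negative reward destabilizes the active rule.
        failed.add(active)
        candidates = [r for r in range(n_rules) if r != active]
        if use_memory:
            # Crude stand-in for auto-evaluation: test candidates covertly
            # against memory and drop those already known to fail.
            untested = [r for r in candidates if r not in failed]
            candidates = untested or candidates
        # Variation: activity moves to another rule at random.
        active = rng.choice(candidates)
    return history


def count_errors(history):
    """Number of trials that ended with a negative reward."""
    return sum(1 for _, reward in history if reward < 0)


if __name__ == "__main__":
    for use_memory in (False, True):
        h = simulate(n_rules=4, correct_rule=2, n_trials=30,
                     seed=1, use_memory=use_memory)
        print("memory" if use_memory else "no memory", count_errors(h))
```

With the memory enabled, the sketch can make at most `n_rules - 1` errors before settling on the rewarded rule, which is one simple way to read the abstract's claim that internal evaluation speeds up learning. Without it, a failed rule may be retried, so the number of errors is not bounded in advance.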