The ubiquity of model-based reinforcement learning
- PMID: 22959354
- PMCID: PMC3513648
- DOI: 10.1016/j.conb.2012.08.003
Abstract
The reward prediction error (RPE) theory of dopamine (DA) function has enjoyed great success in the neuroscience of learning and decision-making. This theory is derived from model-free reinforcement learning (RL), in which choices are made simply on the basis of previously realized rewards. Recently, attention has turned to correlates of more flexible, albeit computationally complex, model-based methods in the brain. These methods are distinguished from model-free learning by their evaluation of candidate actions using expected future outcomes according to a world model. Puzzlingly, signatures from these computations seem to be pervasive in the very same regions previously thought to support model-free learning. Here, we review recent behavioral and neural evidence about these two systems, in an attempt to reconcile their enigmatic cohabitation in the brain.
Copyright © 2012 Elsevier Ltd. All rights reserved.
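The model-free/model-based distinction drawn in the abstract can be made concrete with a minimal sketch: a model-based agent evaluates actions by unrolling a world model of states and rewards, while a model-free agent updates cached action values from a reward prediction error after each realized outcome. The toy MDP, state names, and function names below are hypothetical illustrations, not from the paper.

```python
# Toy deterministic MDP (hypothetical): transitions[state][action] -> (next_state, reward).
# "T" is a terminal state.
transitions = {
    "A": {"left": ("B", 0.0), "right": ("C", 0.0)},
    "B": {"left": ("T", 1.0), "right": ("T", 0.0)},
    "C": {"left": ("T", 0.0), "right": ("T", 2.0)},
}

def model_based_value(state, depth=2):
    """Model-based evaluation: plan by unrolling the world model
    and returning the best achievable cumulative reward."""
    if state == "T" or depth == 0:
        return 0.0
    return max(reward + model_based_value(nxt, depth - 1)
               for nxt, reward in transitions[state].values())

def model_free_update(Q, state, action, reward, next_state, alpha=0.1):
    """Model-free (Q-learning) step: adjust a cached value using only
    the realized reward, via a reward prediction error (RPE)."""
    best_next = max(Q[next_state].values()) if next_state in Q else 0.0
    rpe = reward + best_next - Q[state][action]  # the dopamine-like RPE signal
    Q[state][action] += alpha * rpe
    return rpe
```

The model-based agent immediately "knows" that A → right → right yields 2.0 by simulating the model, whereas the model-free agent must experience rewards repeatedly, nudging its cached Q-values by a learning rate on each trial.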
