Learning, Reward, and Decision Making
- PMID: 27687119
- PMCID: PMC6192677
- DOI: 10.1146/annurev-psych-010416-044216
Abstract
In this review, we summarize findings supporting the existence of multiple behavioral strategies for controlling reward-related behavior, including a dichotomy between the goal-directed or model-based system and the habitual or model-free system in the domain of instrumental conditioning and a similar dichotomy in the realm of Pavlovian conditioning. We evaluate evidence from neuroscience supporting the existence of at least partly distinct neuronal substrates contributing to the key computations necessary for the function of these different control systems. We consider the nature of the interactions between these systems and show how these interactions can lead to either adaptive or maladaptive behavioral outcomes. We then review evidence that an additional system guides inference concerning the hidden states of other agents, such as their beliefs, preferences, and intentions, in a social context. We also describe emerging evidence for an arbitration mechanism between model-based and model-free reinforcement learning, placing such a mechanism within the broader context of the hierarchical control of behavior.
Keywords: Pavlovian; cognitive map; instrumental; model based; model free; outcome valuation.