Learning, Reward, and Decision Making

John P O'Doherty¹, Jeffrey Cockburn¹, Wolfgang M Pauli¹

Affiliation

¹ Division of Humanities and Social Sciences and Computation and Neural Systems Program, California Institute of Technology, Pasadena, California 91125; email: jdoherty@caltech.edu.

PMID: 27687119
PMCID: PMC6192677
DOI: 10.1146/annurev-psych-010416-044216

Review

John P O'Doherty et al. Annu Rev Psychol. 2017 Jan 3;68:73-100. doi: 10.1146/annurev-psych-010416-044216. Epub 2016 Sep 28.

Abstract

In this review, we summarize findings supporting the existence of multiple behavioral strategies for controlling reward-related behavior, including a dichotomy between the goal-directed or model-based system and the habitual or model-free system in the domain of instrumental conditioning and a similar dichotomy in the realm of Pavlovian conditioning. We evaluate evidence from neuroscience supporting the existence of at least partly distinct neuronal substrates contributing to the key computations necessary for the function of these different control systems. We consider the nature of the interactions between these systems and show how these interactions can lead to either adaptive or maladaptive behavioral outcomes. We then review evidence that an additional system guides inference concerning the hidden states of other agents, such as their beliefs, preferences, and intentions, in a social context. We also describe emerging evidence for an arbitration mechanism between model-based and model-free reinforcement learning, placing such a mechanism within the broader context of the hierarchical control of behavior.

Keywords: Pavlovian; cognitive map; instrumental; model based; model free; outcome valuation.

Figures

**Figure 1**
Schematic mapping specific neuroanatomical loci to the implementation of different functions underlying model-based and model-free control. Model-based control depends on a cognitive map of state space and integration of different aspects of a decision, such as effort and estimation uncertainty, as well as the value and the identity of goals or outcomes. Model-free control depends on learning about the value of responses in the current state, based on the history of past reinforcement. The inner circle identifies regions involved in model-based and model-free control, and the outer circle identifies specific subfunctions implemented by particular brain regions, based on the evidence to date as discussed in this review. The objective of this figure is to orient the reader to the location of the relevant brain regions rather than to provide a categorical description of the functions of each region or an exhaustive list of the brain regions involved in reward-related behavior. The neuronal substrates of prediction errors and the loci of arbitration mechanisms are omitted from this figure for simplicity. Y coordinates of coronal brain slices represent their distance from the commissures along the posterior (negative values) to anterior (positive values) axis.
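The contrast between the two controllers can be made concrete with a small sketch. This is not taken from the review: the one-step lever task, the outcome names, the learning rate `alpha`, and both helper functions are hypothetical, chosen only to illustrate the textbook distinction. The model-free learner caches a value for each response from its reinforcement history, whereas the model-based learner computes values on demand from a map of which outcome each response produces.

```python
# Illustrative sketch only: task, names, and parameters are assumptions,
# not the authors' model.

def model_free_update(q, action, reward, alpha=0.5):
    """Cached-value update: move the stored value toward the received reward."""
    q = dict(q)
    q[action] += alpha * (reward - q[action])
    return q

def model_based_values(transitions, outcome_values):
    """Compute action values on the fly from action -> outcome probabilities."""
    return {
        action: sum(p * outcome_values[outcome] for outcome, p in outcomes.items())
        for action, outcomes in transitions.items()
    }

# Hypothetical one-step task: two levers, each leading to one food outcome.
transitions = {"left": {"pellet": 1.0}, "right": {"sucrose": 1.0}}
outcome_values = {"pellet": 1.0, "sucrose": 0.5}

# The model-free learner acquires cached values through repeated reinforcement.
q_mf = {"left": 0.0, "right": 0.0}
for _ in range(20):
    q_mf = model_free_update(q_mf, "left", outcome_values["pellet"])
    q_mf = model_free_update(q_mf, "right", outcome_values["sucrose"])

# Devalue the pellet with no further experience of pressing "left".
outcome_values["pellet"] = 0.0
q_mb = model_based_values(transitions, outcome_values)

print("model-free prefers:", max(q_mf, key=q_mf.get))
print("model-based prefers:", max(q_mb, key=q_mb.get))
```

In this toy setting the cached values still favor the lever that was previously more rewarding, while the values computed from the map shift as soon as the outcome value changes. This is one simple way to picture why the two systems can recommend different actions.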
