Review
Annu Rev Psychol. 2017 Jan 3:68:73-100.
doi: 10.1146/annurev-psych-010416-044216. Epub 2016 Sep 28.

Learning, Reward, and Decision Making

John P O'Doherty et al. Annu Rev Psychol. 2017.

Abstract

In this review, we summarize findings supporting the existence of multiple behavioral strategies for controlling reward-related behavior, including a dichotomy between the goal-directed or model-based system and the habitual or model-free system in the domain of instrumental conditioning and a similar dichotomy in the realm of Pavlovian conditioning. We evaluate evidence from neuroscience supporting the existence of at least partly distinct neuronal substrates contributing to the key computations necessary for the function of these different control systems. We consider the nature of the interactions between these systems and show how these interactions can lead to either adaptive or maladaptive behavioral outcomes. We then review evidence that an additional system guides inference concerning the hidden states of other agents, such as their beliefs, preferences, and intentions, in a social context. We also describe emerging evidence for an arbitration mechanism between model-based and model-free reinforcement learning, placing such a mechanism within the broader context of the hierarchical control of behavior.

Keywords: Pavlovian; cognitive map; instrumental; model based; model free; outcome valuation.

Figures

Figure 1
Schematic mapping specific neuroanatomical loci to the implementation of different functions underlying model-based and model-free control. Model-based control depends on a cognitive map of state space and integration of different aspects of a decision, such as effort and estimation uncertainty, as well as the value and the identity of goals or outcomes. Model-free control depends on learning about the value of responses in the current state, based on the history of past reinforcement. The inner circle identifies regions involved in model-based and model-free control, and the outer circle identifies specific subfunctions implemented by particular brain regions, based on the evidence to date as discussed in this review. The objective of this figure is to orient the reader to the location of the relevant brain regions rather than to provide a categorical description of the functions of each region or an exhaustive list of the brain regions involved in reward-related behavior. The neuronal substrates of prediction errors and the loci of arbitration mechanisms are omitted from this figure for simplicity. Y coordinates of coronal brain slices represent their distance from the commissures along the posterior (negative values) to anterior (positive values) axis.
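The contrast drawn in this caption between model-free and model-based control can be illustrated with a toy example. The sketch below is not taken from the review: the function names, the learning rate `alpha`, and the single lever-press task are all hypothetical choices made for illustration. It assumes a simple prediction-error update for the model-free learner and a one-step lookup through a known state map for the model-based learner, then shows how the two respond when the outcome is devalued.

```python
from typing import Dict, Tuple

# Hypothetical types for illustration only.
StateAction = Tuple[str, str]


def model_free_update(q: Dict[StateAction, float], state: str, action: str,
                      reward: float, alpha: float = 0.1) -> float:
    """Nudge the cached value of (state, action) toward the reward received.

    The value reflects only the history of past reinforcement.
    """
    key = (state, action)
    old_value = q.get(key, 0.0)
    prediction_error = reward - old_value
    q[key] = old_value + alpha * prediction_error
    return q[key]


def model_based_value(transitions: Dict[StateAction, Dict[str, float]],
                      outcome_values: Dict[str, float],
                      state: str, action: str) -> float:
    """Compute an action value on the fly from a map of state space
    (outcome probabilities) and the current value of each outcome."""
    outcomes = transitions[(state, action)]
    return sum(prob * outcome_values[outcome] for outcome, prob in outcomes.items())


def devaluation_demo() -> Dict[str, float]:
    """Train on a rewarded lever press, then devalue the food outcome."""
    q: Dict[StateAction, float] = {}
    transitions = {("lever", "press"): {"food": 1.0}}  # assumed known map
    outcome_values = {"food": 1.0}

    # Training phase: pressing the lever repeatedly yields food worth 1.0.
    for _ in range(50):
        model_free_update(q, "lever", "press", reward=outcome_values["food"])

    mb_before = model_based_value(transitions, outcome_values, "lever", "press")

    # Devaluation: the food loses its value, with no further lever presses.
    outcome_values["food"] = 0.0

    return {
        "model_free_after": q[("lever", "press")],  # cached value is unchanged
        "model_based_before": mb_before,
        "model_based_after": model_based_value(
            transitions, outcome_values, "lever", "press"),
    }


if __name__ == "__main__":
    print(devaluation_demo())
```

In this toy setting the model-based value tracks the new outcome value immediately, whereas the cached model-free value changes only after further experience of the devalued outcome. This is a simplification under the stated assumptions, not a description of the neural implementation.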
