Review. 2011 Jun;125(3):297-317. doi: 10.1037/a0023575.

Decision making and reward in frontal cortex: complementary evidence from neurophysiological and neuropsychological studies


Steven W Kennerley et al. Behav Neurosci. 2011 Jun.
Free PMC article

Abstract

Patients with damage to the prefrontal cortex (PFC), especially its ventral and medial parts, often show a marked inability to make choices that meet their needs and goals. These decision-making impairments reflect both a deficit in learning the consequences of a choice and a deficit in adapting future choices based on the experienced value of the current choice. Thus, areas of PFC must support value computations that are necessary for optimal choice. However, recent frameworks of decision making have highlighted that optimal and adaptive decision making does not rest on a single computation; rather, a number of different value computations may be necessary. Using this framework as a guide, we summarize evidence from both lesion studies and single-neuron physiology for the representation of different value computations across PFC areas.


Figures

Figure 1
Figure 1. Single neurons encode value and actions in a multivariable decision-making task. (A) Subjects made choices between pairs of presented pictures. (B) There were six sets of pictures, each associated with a specific outcome. We varied the value of the outcome by manipulating either the amount of reward the subject would receive (payoff), the likelihood of receiving a reward (probability), or the number of times the subject had to press a lever to earn the reward (effort). We manipulated one parameter at a time, holding the other two fixed. Presented pictures were always adjacent to one another in terms of value; that is, choices were 1 versus 2, 2 versus 3, 3 versus 4, or 4 versus 5. (C and D) Spike density histograms illustrating the activity recorded from single neurons under three different types of value manipulation (probability, payoff, or effort). The vertical lines indicate the onset of the pictures indicating the value of the choice (left) and the time at which the animal was able to make his choice (right). The different colored lines indicate the value of the choice under consideration or which action the subject would select. (C) Anterior cingulate cortex (ACC) neuron encodes payoff and effort but not probability. (D) ACC neuron encodes the value and action of all three decision variables. (E) Percentage of all neurons selective for value for each decision variable. All variables are predominantly coded in ACC. (F) Percentage of all neurons selective for value as a function of the number of decision variables encoded. ACC neurons tend to multiplex decision value across two (as in C) and three (as in D) decision variables. (G) Percentage of all neurons selective for action for each decision variable. Orbitofrontal cortex (OFC) neurons are less likely to encode action information relative to lateral prefrontal cortex (LPFC) and ACC. χ2 test, * p < .05. From “Neurons in the Frontal Lobe Encode the Value of Multiple Decision Variables,” by S. W.
Kennerley, A. F. Dahmubed, A. H. Lara, and J. D. Wallis, 2009, Journal of Cognitive Neuroscience, 21, Figure 1. Copyright 2008 by the Massachusetts Institute of Technology. Adapted with permission.
Figure 2
Figure 2. The influence of reward on spatial tuning in a delayed response task. (A) In the reward-space (RS) task, the subject sees two cues separated by a delay. The first cue indicates the amount of juice to expect for successful performance of the task, and the second cue indicates the location the subject must maintain in spatial working memory. The subject indicates his response by making a saccade to the location of the mnemonic cue 1 s later. The fixation cue changes to yellow to tell the subject to initiate his saccade. The space-reward (SR) task is identical except that the cues appear in the opposite order. There are five different reward amounts, each predicted by one of two cues, and 24 spatial locations. (B and C) Spike density histograms of single neurons illustrating how the size of an expected reward can modulate spatial tuning of information held in working memory. The graphs illustrate neuronal activity as animals remember different locations on a computer screen under the expectancy of receiving either a small or a large reward for correct performance. The gray bar indicates the presentation of the mnemonic spatial cue. To enable clear visualization, the spatial data are collapsed into four groups, each consisting of six of the 24 spatial locations tested. The inset indicates the mean standardized firing rate of the neuron across the 24 spatial locations. (B) When the subject expected a small reward, the neuron showed little spatial selectivity, consisting only of an increase in firing rate when the subject was remembering locations in the top left of the screen. When the subject expected a large reward for correct performance, spatial selectivity dramatically increased, with a high firing rate for locations in the top left of the screen and a low firing rate for locations in the bottom right. Spatial selectivity was evident primarily during cue presentation.
(C) A neuron that showed moderate spatial selectivity when the subject expected a small reward, but a dramatic increase in spatial selectivity for targets in the top right when the subject expected a large reward. This reward modulation of spatial selectivity persisted into the delay period, indicating a reward modulation of the information contained in working memory. Panel A from “Reward-Dependent Modulation of Working Memory in Lateral Prefrontal Cortex,” by S. W. Kennerley and J. D. Wallis, 2009, Journal of Neuroscience, 29, p. 3260, Figure 1, and from J. D. Wallis and S. W. Kennerley, 2010.
Figure 3
Figure 3. Single neurons encode reward prediction errors. (A and B) Spike density histograms illustrating the activity of single neurons synced to the presentation of conditioned stimuli (left panels) associated with different probabilities of reward delivery, or synced to the onset of reward on rewarded trials (middle columns) or the expected onset of reward on nonrewarded trials (right columns). The vertical lines in the left panel indicate the onset of the choice stimuli (left) and the time at which the animal was able to make his choice (right); the vertical lines in the middle and right columns indicate the time at which the reward was (rewarded trials) or would have been (nonrewarded trials) delivered following the choice. The different colored lines indicate the value of the chosen probability stimulus, which also determines the size of the prediction error (PE), where PE equals 1 - chosen probability for rewarded trials and 0 - chosen probability for nonrewarded trials. The lower row of plots indicates the regression coefficients (RC) from a sliding linear regression testing the relationship between the neuron's firing rate and the probability of reward delivery. Red data points indicate time points at which the probability of reward delivery (or size of prediction error) significantly predicted the neuron's firing rate. (A) Anterior cingulate cortex (ACC) neuron encodes expected probability at the time of choice (left panel). This neuron also encodes a positive prediction error at the time of reward onset (middle column), but is insensitive to negative prediction errors (right column). (B) ACC neuron encodes expected probability at the time of choice (left panel), encodes positive prediction errors on rewarded trials (middle column), and encodes negative prediction errors on nonrewarded trials (right column).
(C) Percentage of all neurons selective for positive prediction errors only, negative prediction errors only, or both positive and negative prediction errors. ACC neurons are more likely to encode positive prediction errors or both positive and negative prediction errors relative to lateral prefrontal cortex (LPFC) and OFC. OFC = orbitofrontal cortex. χ2 test, * p < .05.
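The prediction-error definition used in this caption (outcome minus chosen reward probability) can be written out directly. This is a minimal illustrative sketch, not the authors' analysis code; the function name is ours:

```python
# Reward prediction error as defined in the Figure 3 caption:
# PE = 1 - chosen probability on rewarded trials,
# PE = 0 - chosen probability on nonrewarded trials.
# Illustrative sketch only; prediction_error is not from the paper.

def prediction_error(chosen_probability: float, rewarded: bool) -> float:
    outcome = 1.0 if rewarded else 0.0
    return outcome - chosen_probability

# Omission of reward after a high-probability choice gives a large
# negative PE:
print(prediction_error(0.75, rewarded=False))  # -0.75
# Reward after a low-probability choice gives a large positive PE:
print(prediction_error(0.25, rewarded=True))   # 0.75
```

In these terms, the neuron in panel A tracks only the positive values of this quantity, whereas the neuron in panel B tracks both signs.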
Figure 4
Figure 4. Use of reward information as evidence for selecting the correct response. (A and B) Schematics of the stimulus-based (panel A) and action-based (panel B) reversal learning tasks. For the stimulus-based task, animals chose between two stimuli presented on the left and right of a touchscreen, only one of which was associated with reward across a block of trials. Stimulus-outcome contingencies reversed (i.e., the other stimulus became the rewarded stimulus) after animals had performed at 90% correct (27/30) across 2 days of testing. For the action-based task, animals chose between two joystick movements, only one of which was associated with reward across a block of trials. Action-outcome contingencies reversed (i.e., the other action became the rewarded action) after animals had gained 25 rewards for a particular action. (C) Effect of orbitofrontal cortex (OFC) lesions on using reinforcement information in the stimulus-based reversal task. (From “Amygdala and Orbitofrontal Cortex Lesions Differentially Influence Choices During Object Reversal Learning,” by P. H. Rudebeck and E. A. Murray, 2008, Journal of Neuroscience, 28, Figure 5. Copyright 2008 by E. A. Murray. Adapted with permission.) (D) Effect of OFC lesions on using reinforcement information in the action-based joystick reversal task. (From “Frontal Cortex Subregions Play Distinct Roles in Choices Between Actions and Stimuli,” by P. H. Rudebeck, T. E. Behrens, S. W. Kennerley, M. G. Baxter, M. J. Buckley, M. E. Walton, and M. F. Rushworth, 2008, Journal of Neuroscience, 28, Figure 5. Copyright 2008 by S. W. Kennerley. Adapted with permission.) (E) Effect of anterior cingulate cortex (ACC) lesions on using reinforcement information in the stimulus-based reversal task. (F) Effect of ACC lesions on using reinforcement information in the action-based joystick reversal task. (From “Optimal Decision Making and the Anterior Cingulate Cortex,” by S. W. Kennerley, M. E. Walton, T. E. Behrens, M. J. Buckley, and M. F. Rushworth, 2006, Nature Neuroscience, 9, Figure 3. Copyright 2006 by S. W. Kennerley. Adapted with permission.) E + 1 = performance on a trial after an error; EC + 1 = performance on a trial following a single correct response after an error; EC(N) + 1 = performance on a trial following N correct responses after an error. Data are included for each trial type for which every animal had at least 10 instances (which is why there are fewer trial types in the analysis of the stimulus-based reversal task in Rudebeck & Murray, 2008).
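The E + 1, EC + 1, and EC(N) + 1 trial labels amount to counting consecutive correct responses since the most recent error. The classification can be sketched as follows (a hypothetical illustration, not the authors' analysis code):

```python
# Hypothetical sketch of the post-error trial classification in Figure 4:
# a trial is labeled by what preceded it -- 'E+1' if the previous trial
# was an error, 'EC(N)+1' if an error was followed by N correct responses.
# Not the authors' code; EC+1 in the caption corresponds to 'EC(1)+1' here.

def label_trials(outcomes):
    """outcomes: per-trial booleans (True = correct response)."""
    labels = []
    correct_since_error = None  # None until the first error occurs
    for outcome in outcomes:
        if correct_since_error is None:
            labels.append(None)            # no error yet, trial unlabeled
        elif correct_since_error == 0:
            labels.append("E+1")
        else:
            labels.append(f"EC({correct_since_error})+1")
        if not outcome:
            correct_since_error = 0        # an error resets the count
        elif correct_since_error is not None:
            correct_since_error += 1
    return labels

# An error followed by a run of correct responses:
print(label_trials([True, False, True, True, True]))
# [None, None, 'E+1', 'EC(1)+1', 'EC(2)+1']
```

Performance on each label then measures how well an animal sustains a newly rewarded choice after negative feedback, which is the comparison plotted in panels C through F.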
Figure 5
Figure 5. Choices on a changeable three-armed bandit task and the influences on current behavior. (A) Two example predetermined reward schedules. The schedules determined whether reward was delivered for selecting a stimulus (stimulus A to C) on a particular trial. Dashed black lines represent the reversal point in the schedule when the identity of the highest value stimulus changes. (B) Average likelihood of choosing the highest value stimulus in the two schedules in the control (solid black line) and orbitofrontal cortex (OFC) groups (dashed black line). SEMs are the filled gray and blue areas, respectively, for the two groups. Colored points represent the reward probability of the highest value stimulus. (C) Matrix of components included in the logistic regression and the influence on current behavior of (i) recent choices and their specific outcomes (red Xs, bottom right graph); (ii) the previous choice and each recent past outcome (blue Xs, top right graph); and (iii) the previous outcome and each recent past choice (green Xs, bottom left graph). Green area represents the influence of associations between choices and rewards received in the past; blue area represents the influence of associations between past rewards and choices made in the subsequent trials. The data for the first trial in the past in the three plots are identical. Controls, solid black lines; OFCs, dashed gray lines. From “Separable Learning Systems in the Macaque Brain and the Role of Orbitofrontal Cortex in Contingent Learning,” by M. E. Walton, T. E. Behrens, M. J. Buckley, P. H. Rudebeck, and M. F. Rushworth, 2010, Neuron, 65, Figures 1 and 2.
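The lagged design described in panel C, pairing a choice i trials in the past with an outcome j trials in the past, can be sketched schematically. This is a simplified illustration under our own assumptions (binary rewards, a small fixed number of lags, invented function and variable names), not the regression model from the paper:

```python
# Hypothetical sketch of the Figure 5C matrix of components: one
# regressor per (choice lag i, outcome lag j) pair. Names and the
# number of lags are illustrative only, not from the paper.

def lagged_regressors(choices, rewards, n_lags=3):
    """choices: per-trial stimulus identities; rewards: per-trial 0/1.
    For each trial t >= n_lags, collect the (past choice, past outcome)
    pair for every lag combination 1..n_lags."""
    rows = []
    for t in range(n_lags, len(choices)):
        row = {}
        for i in range(1, n_lags + 1):      # lag of the past choice
            for j in range(1, n_lags + 1):  # lag of the past outcome
                row[(i, j)] = (choices[t - i], rewards[t - j])
        rows.append(row)
    return rows

rows = lagged_regressors(["A", "B", "A", "C", "B"], [1, 0, 1, 1, 0])
# Diagonal (i == j) terms pair a choice with the outcome of that same
# past trial (contingent learning); off-diagonal terms capture the
# choice-reward associations shown in the blue and green areas.
print(rows[0][(1, 1)])
```

Fitting a logistic regression of current choice on such regressors then separates the influence of genuinely contingent choice-outcome pairings from spread-of-effect associations, which is the contrast between the control and OFC groups in panel C.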
