Review

Reward Predictions and Computations

John P. O’Doherty

In: Neurobiology of Sensation and Reward. Boca Raton (FL): CRC Press/Taylor & Francis; 2011. Chapter 14.
Excerpt

The ability to predict when and where a reward will occur enables humans and other animals to initiate behavioral responses prospectively in order to maximize the probability of obtaining that reward. Reward predictions can take a number of distinct forms depending on the nature of the associative relationship underpinning them (Balleine et al. 2008; and see Chapters 13 and 15 in this volume). The simplest form of reward prediction is one based on an associative “Pavlovian” relationship between arbitrary stimuli and rewards, acquired through repeated contingent pairing of the stimulus with the reward. Subsequent presentation of the stimulus elicits a predictive representation of the reward, by virtue of the learned stimulus-reward association. This form of prediction is purely passive: it signals when a reward might be expected to occur and elicits Pavlovian conditioned reflexes, but it does not specify the behavioral actions that should be initiated in order to obtain the reward. By contrast, other forms of reward prediction are grounded in learned instrumental associations between stimuli, responses, and rewards, thereby specifying the behavioral responses that, when performed by the animal, increase the probability of obtaining that reward. Instrumental reward predictions can be either goal-directed (based on response-outcome associations and therefore sensitive to the incentive value of the outcome) or habitual (based on stimulus-response associations and hence insensitive to changes in outcome value) (Balleine and Dickinson 1998). In this chapter, we will review evidence for the presence of multiple types of predictive reward signal in the brain. We will also outline some of the candidate computational mechanisms that might be responsible for the acquisition of these different forms of reward prediction and evaluate evidence for the presence of such mechanisms in the brain.
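
A classic candidate mechanism for the Pavlovian case described above is an error-driven (Rescorla-Wagner / adaptive-critic style) update, in which the prediction is nudged toward the obtained reward on every stimulus-reward pairing (cf. Barto 1992, 1995). The short sketch below is only an illustration of that idea, not code from the chapter; the function name, learning rate, and trial counts are arbitrary choices made here for exposition.

    # Minimal Rescorla-Wagner-style sketch of learning a Pavlovian reward prediction.
    # Illustrative only: names and parameter values are chosen for exposition.

    def learn_reward_prediction(n_pairings=50, alpha=0.1, reward=1.0, v0=0.0):
        """Update a single stimulus's reward prediction over repeated pairings."""
        v = v0                    # current prediction of reward following the stimulus
        history = []
        for _ in range(n_pairings):
            delta = reward - v    # prediction error: obtained minus predicted reward
            v += alpha * delta    # move the prediction a small step toward the outcome
            history.append(v)
        return history

    values = learn_reward_prediction()
    print(f"prediction after 10 pairings: {values[9]:.3f}")   # ~0.651
    print(f"prediction after 50 pairings: {values[-1]:.3f}")  # ~0.995

With a fixed learning rate the prediction converges toward the reward magnitude, which is the sense in which repeated contingent pairing yields a predictive representation of the reward.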


References

    1. Adams C. D. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q J Exp Psychol. 1981;34B:77–98.
    2. Balleine B. W., Daw N. D., O’Doherty J. P. Multiple forms of value learning and the function of dopamine. In: Glimcher P. W., Camerer C. F., Poldrack R. A., Fehr E., editors. Neuroeconomics: Decision Making and the Brain. New York: Academic Press; 2008. p. 367–87.
    3. Balleine B. W., Dickinson A. Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates. Neuropharmacology. 1998;37:407–19.
    4. Barto A. G. Reinforcement learning and adaptive critic methods. In: White D. A., Sofge D. A., editors. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches. New York: Van Nostrand Reinhold; 1992. p. 469–91.
    5. Barto A. G. Adaptive critics and the basal ganglia. In: Houk J. C., Davis J. L., Beiser D. G., editors. Models of Information Processing in the Basal Ganglia. Cambridge, MA: MIT Press; 1995. p. 215–32.