Reinforcement learning models and their neural correlates: An activation likelihood estimation meta-analysis
- PMID: 25665667
- PMCID: PMC4437864
- DOI: 10.3758/s13415-015-0338-7
Abstract
Reinforcement learning describes motivated behavior in terms of two abstract signals. The representation of discrepancies between expected and actual rewards/punishments, the prediction error, is thought to update the expected value of actions and predictive stimuli. Electrophysiological and lesion studies have suggested that mesostriatal prediction error signals control behavior through synaptic modification of cortico-striato-thalamic networks. Signals in the ventromedial prefrontal and orbitofrontal cortex are implicated in representing expected value. To obtain unbiased maps of these representations in the human brain, we performed a meta-analysis of functional magnetic resonance imaging studies that had employed algorithmic reinforcement learning models across a variety of experimental paradigms. We found that the ventral striatum (medial and lateral) and midbrain/thalamus represented reward prediction errors, consistent with animal studies. Prediction error signals were also seen in the frontal operculum/insula, particularly for social rewards. In Pavlovian studies, striatal prediction error signals extended into the amygdala, whereas instrumental tasks engaged the caudate. Prediction error maps were sensitive to the model-fitting procedure (fixed or individually estimated) and to the extent of spatial smoothing. A correlate of expected value was found in a posterior region of the ventromedial prefrontal cortex, caudal and medial to the orbitofrontal regions identified in animal studies. These findings highlight a reproducible motif of reinforcement learning in the cortico-striatal loops and identify methodological dimensions that may influence the reproducibility of activation patterns across studies.
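For orientation, the two signals described above are conventionally formalized with a delta rule of the Rescorla-Wagner/temporal-difference family. The following is a minimal sketch, assuming a single learning rate alpha (which, per the abstract, may be fixed or estimated per participant); it is not an equation taken from this specific paper:

  \delta_t = r_t - V_t(s_t)                      % reward prediction error: actual minus expected reward
  V_{t+1}(s_t) = V_t(s_t) + \alpha \, \delta_t   % expected-value update driven by the prediction error

In model-based fMRI analyses of this kind, the trial-by-trial \delta_t and V_t series are typically entered as parametric regressors to localize the striatal and ventromedial prefrontal correlates reported above.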
Conflict of interest statement
The authors declare no financial conflicts of interest that may have biased the present work.
