Nature. 2008 Nov 13;456(7219):245-9.
doi: 10.1038/nature07538.

Associative learning of social value

Timothy E J Behrens et al. Nature.

Abstract

Our decisions are guided by information learnt from our environment. This information may come via personal experiences of reward, but also from the behaviour of social partners. Social learning is widely held to be distinct from other forms of learning in its mechanism and neural implementation; it is often assumed to compete with simpler mechanisms, such as reward-based associative learning, to drive behaviour. Recently, neural signals have been observed during social exchange reminiscent of signals seen in studies of associative learning. Here we demonstrate that social information may be acquired using the same associative processes assumed to underlie reward-based learning. We find that key computational variables for learning in the social and reward domains are processed in a similar fashion, but in parallel neural processing streams. Two neighbouring divisions of the anterior cingulate cortex were central to learning about social and reward-based information, and for determining the extent to which each source of information guides behaviour. When making a decision, however, the information learnt using these parallel streams was combined within ventromedial prefrontal cortex. These findings suggest that human social valuation can be realized by means of the same associative processes previously established for learning other, simpler, features of the environment.

Figures

Figure 1
Experimental task and behavioural findings. (a) Experimental task (see Methods and Supplementary Information). Each trial consists of four phases. Subjects are presented with a decision (CUE), receive the advice (red square) of the confederate (SUGGEST) and respond using a button press (grey square). An INTERVAL period follows, before the correct outcome is revealed (MONITOR). If the subject chooses correctly, the red bar is incremented by the number of points on the chosen option. (b,c) Schedules for reward (b) and social (c) information. Dashed lines show the true probability of blue being correct (b) and the true probability of correct confederate advice (c). Each schedule underwent periods of stability and volatility. Solid lines show the model's estimate of the probabilities. (d) Optimal model estimates of the volatility of reward (green) and social (red) information. (e) Logistic regression on subject behaviour. Factors included were: the reward magnitude difference between options (RMD); the outcome probability derived from the model using reward outcomes (RLO); the outcome probability derived from the model using confederate advice (RLC); the possibility that the subjects would blindly follow the confederate without learning (BFC); and the possibility that subjects would assume the confederate would behave as in the previous trial (CPT). The logistic regression analysis revealed significant effects only of RMD, RLO and RLC.
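The logistic regression in panel (e) can be illustrated with a minimal sketch. This is not the authors' analysis code: the data below are synthetic, the regressors are standard-normal stand-ins for RMD, RLO and RLC, the weights in `true_w` are made up, and `fit_logistic` is a hypothetical helper that fits the model by Newton's method. BFC and CPT are left out for brevity.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton's method; returns [intercept, weights...]."""
    Xb = np.column_stack([np.ones(len(X)), X])   # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))        # predicted choice probability
        grad = Xb.T @ (y - p)                    # gradient of the log-likelihood
        # Negative Hessian, with a tiny ridge term for numerical stability
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + 1e-8 * np.eye(Xb.shape[1])
        w = w + np.linalg.solve(hess, grad)      # Newton update
    return w

# Synthetic stand-ins for the per-trial regressors (assumed, not real data)
rng = np.random.default_rng(0)
n_trials = 2000
rmd = rng.normal(size=n_trials)   # reward magnitude difference
rlo = rng.normal(size=n_trials)   # model probability from reward outcomes
rlc = rng.normal(size=n_trials)   # model probability from confederate advice
X = np.column_stack([rmd, rlo, rlc])

# Made-up generating weights: intercept, RMD, RLO, RLC
true_w = np.array([0.0, 1.0, 1.5, 0.8])
choice_prob = 1.0 / (1.0 + np.exp(-(true_w[0] + X @ true_w[1:])))
y = (rng.random(n_trials) < choice_prob).astype(float)   # simulated binary choices

w_hat = fit_logistic(X, y)
for name, w in zip(["intercept", "RMD", "RLO", "RLC"], w_hat):
    print(f"{name}: {w:.2f}")
```

In a real analysis the fitted weights would be estimated per subject and then tested against zero across subjects; the sketch only shows how each factor enters the model as a separate regressor.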
Figure 2
Predictions and prediction errors in social and non-social domains. Timecourses show (partial) correlations ± s.e.m. See Figure S2. (a) Activation in the DMPFC, right TPJ/STS and MTG correlates with the social prediction error at the outcome (thresholded at Z>3.1, cluster size >50 voxels). (b) Deconstruction of signal change in the DMPFC. Similar results were found in the MTG and TPJ/STS. Top panel: Following the outcome, areas that encode prediction error correlate positively with the outcome and negatively with the predicted probability. Red: effect size of the confederate lie outcome (1 for lie, 0 for truth). Blue: effect size of the predicted confederate lie probability. To perform inference, we fit a hemodynamic model in each subject to the timecourse of this effect (i.e. to the blue line). The green line in the top panel shows the mean overall fit of this hemodynamic model (for comparison with the blue line). Bottom panel: The effect of lie probability (blue line from top panel) is decomposed into an HRF at each trial event (Figure S2). Dashed and solid lines show mean responses ± s.e.m. Each region showed a significant positive effect of predicted confederate lie probability after the decision (t(22)=1.96 (p<0.05), 1.73 (p<0.05), 1.74 (p<0.05) for DMPFC, MTG and TPJ/STS respectively). Crucially, each brain region showed a significant negative effect of predicted confederate lie probability after the outcome (t(22)=2.68 (p<0.005), 2.35 (p<0.05), 3.27 (p<0.005)). (c) Ventral striatum is taken as an example of a number of regions revealed by the voxelwise analysis of reward prediction error (thresholded at Z>3.1, cluster size >100 voxels). (d) Panels are exactly as in (b), but coded in terms of reward and not in terms of confederate fidelity. Top panel shows the parameter estimate relating to the expected value of the trial (blue line) and, after the outcome, the parameter estimate relating to the magnitude of these rewards (grey line). To test for prediction error coding, we again fit a hemodynamic model to the expectation parameter estimate (shown by the green line, for comparison with the blue line). Bottom panel: The timecourse showed a significant positive effect during the time of the decision (t(22)=3.32 (p<0.002)), and a significant negative effect after the trial outcome (t(22)=2.50 (p<0.05)); see Supplementary Information for further discussion.
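The signature described in panel (b), a positive correlation with the outcome and a negative correlation with the predicted probability, follows directly from defining the prediction error as outcome minus prediction. The sketch below illustrates this on synthetic data only. The simulated `signal`, the noise level and the range of `p_pred` are assumptions for illustration, and no hemodynamic modelling is included.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 500

# Hypothetical per-trial predicted probability that the confederate lies
p_pred = rng.uniform(0.1, 0.9, size=n_trials)
# Outcome coded as in the caption: 1 for lie, 0 for truth
outcome = (rng.random(n_trials) < p_pred).astype(float)

# A simulated signal that encodes the prediction error, plus noise (assumed)
prediction_error = outcome - p_pred
signal = prediction_error + rng.normal(scale=0.5, size=n_trials)

# Regress the signal on outcome and prediction as separate regressors
design = np.column_stack([np.ones(n_trials), outcome, p_pred])
beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
beta_outcome, beta_pred = beta[1], beta[2]

print(f"effect of outcome:    {beta_outcome:+.2f}")
print(f"effect of prediction: {beta_pred:+.2f}")
```

Entering outcome and prediction as separate regressors, rather than a single difference term, is what allows the two effects to be tested with opposite signs.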
Figure 3
Agency-specific learning rates dissociate in the ACC. (a) Regions where the BOLD correlates of reward (green) and confederate (red) volatility predict the influence that each source of information has on subject behaviour (Z>3.1, p<0.05 cluster-corrected for cingulate cortex). Subjects with high BOLD signal changes in response to reward volatility in the ACC sulcus are guided strongly by reward history information (max Z=3.7; correlation (R=0.7163, p<0.0001) shown in (b)). Subjects with high BOLD signal changes in response to confederate advice volatility in the ACC gyrus are guided strongly by social information (max Z=4.1; correlation (R=0.7252, p<0.0001) shown in (c)).
Figure 4
Combination of expected value of chosen option in VMPFC. (a) Activation for the combination (mean contrast) of experience-based probability during CUE and SUGGEST phases, and advice-based probability during SUGGEST phase (thresholded at Z>3.1, p<0.005 cluster-corrected for VMPFC). These phases represent the times at which subjects had these probabilities available to them (see supplementary information and figure S4). (b) Correlation between effect of outcome-based probability in VMPFC during the decision and effect of outcome volatility in ACCs during MONITOR (R = 0.6119, p<0.0002). (c) Correlation between effect of confederate-based probability in VMPFC during the decision and effect of confederate volatility in ACCs during MONITOR (R = 0.6119, p<0.0002).
