Elife. 2019 Dec 18;8:e51439.
doi: 10.7554/eLife.51439.

Catecholaminergic modulation of meta-learning

Jennifer L Cook et al. Elife.

Abstract

The remarkable expedience of human learning is thought to be underpinned by meta-learning, whereby slow accumulative learning processes are rapidly adjusted to the current learning environment. To date, the neurobiological implementation of meta-learning remains unclear. A burgeoning literature argues for an important role for the catecholamines dopamine and noradrenaline in meta-learning. Here, we tested the hypothesis that enhancing catecholamine function modulates the ability to optimise a meta-learning parameter (learning rate) as a function of environmental volatility. 102 participants completed a task which required learning in stable phases, where the probability of reinforcement was constant, and volatile phases, where probabilities changed every 10-30 trials. The catecholamine transporter blocker methylphenidate enhanced participants' ability to adapt learning rate: under methylphenidate, compared with placebo, participants exhibited higher learning rates in volatile relative to stable phases. Furthermore, this effect was significant only with respect to direct learning based on the participants' own experience; there was no significant effect on inferred-value learning, where stimulus values had to be inferred. These data demonstrate a causal link between catecholaminergic modulation and the adjustment of the meta-learning parameter learning rate.
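The role of the learning rate can be illustrated with a simple delta-rule (Rescorla-Wagner style) update. This is a minimal sketch under stated assumptions, not the model fitted in the study: the function name, the starting value of 0.5, and the example learning rates are all hypothetical.

```python
def delta_rule(outcomes, learning_rate, v0=0.5):
    """Return the running value estimate after each outcome.

    Assumption: a plain delta rule, v <- v + alpha * (outcome - v).
    The starting value v0 and the rule itself are illustrative only.
    """
    v = v0
    trace = []
    for outcome in outcomes:
        prediction_error = outcome - v
        v += learning_rate * prediction_error
        trace.append(v)
    return trace


# Ten rewarded trials followed by a reversal to ten unrewarded trials.
outcomes = [1] * 10 + [0] * 10
fast = delta_rule(outcomes, learning_rate=0.5)  # hypothetical "volatile" setting
slow = delta_rule(outcomes, learning_rate=0.1)  # hypothetical "stable" setting

# After the reversal the higher learning rate has moved further
# toward the new outcome level than the lower one.
print(round(fast[-1], 3), round(slow[-1], 3))
```

A higher learning rate tracks a reversal quickly but is more sensitive to chance outcomes when probabilities are constant, which is why adapting it to volatility is useful.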

Keywords: dopamine; human; learning rate; meta-learning; methylphenidate; neuroscience; noradrenaline; volatility.

Conflict of interest statement

JC, JS, MF, AD, DG: No competing interests declared. Hd has acted as a consultant for Eleusis benefit corps but does not own shares; Eleusis has no involvement in this study. RC has acted as a consultant for Pfizer and Abbvie but does not own shares; Pfizer and Abbvie have no involvement in this study.

Figures

Figure 1. Task design.
Participants selected between two bandits (blue and green boxes) in order to win points. On each trial, participants saw the direct sources (boxes, 1–4 s); subsequently, either the blue or green box was highlighted with a red frame (the indirect source, 1–4 s). Participants were instructed that the frame represented either the most popular choice made by a group of participants who had completed the task previously (Social Group) or the choice from rigged roulette wheels (Non-social Group). After participants had responded, their selected option was framed in grey. After 0.5–2 s participants received feedback in the form of a green or blue box between the two bandits. A new trial began after 1–3 s. Success resulted in the red reward bar progressing towards the silver and gold goals. The probability of reward associated with the blue and green boxes and the probability that the red frame surrounded the correct box varied according to probabilistic schedules which comprised stable and volatile phases.
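A schedule with a stable phase followed by a volatile phase could be sketched as follows. Only the 10-30 trial switching interval comes from the abstract; the probability values (0.75, 0.8, 0.2), the phase order, the phase lengths, and all function names are assumptions made for illustration and are not the schedules used in the study.

```python
import random


def make_schedule(n_stable, n_volatile, p_stable=0.75,
                  p_high=0.8, p_low=0.2, seed=0):
    """Return a per-trial list of P(blue is correct).

    Assumptions: one stable phase followed by one volatile phase, with
    hypothetical probability values. In the volatile phase the
    probability switches every 10-30 trials; the final block may be
    shorter because it is truncated at the end of the phase.
    """
    rng = random.Random(seed)
    probs = [p_stable] * n_stable
    total = n_stable + n_volatile
    current = p_high
    while len(probs) < total:
        block_len = rng.randint(10, 30)  # inclusive on both ends
        probs.extend([current] * min(block_len, total - len(probs)))
        current = p_low if current == p_high else p_high
    return probs


def sample_outcomes(probs, seed=0):
    """Draw one binary outcome per trial (1 = blue correct)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for p in probs]


schedule = make_schedule(n_stable=60, n_volatile=60)
outcomes = sample_outcomes(schedule)
```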
Figure 1—figure supplement 1. Probabilistic schedules.
Solid blue lines show the probability of blue being the correct choice, dashed red lines show the probability of the red frame information being correct. For the win-stay, lose-shift analysis the interaction between drug, learning type and volatility remained significant when probabilistic schedule group (1,2,3,4) was included as a factor in the ANOVA (F(1,94) = 7.935, p = 0.006). There was no significant main effect of schedule (F(3,94) = 0.906, p = 0.441) and no interactions involving both drug and schedule (all p > 0.05). For the learning rate analysis, the interaction between drug, learning type and volatility remained significant when schedule (1,2,3,4) was included as a factor in the ANOVA (F(1,94) = 7.500, p = 0.007). There was no significant main effect of schedule (F(3,94) = 0.850, p = 0.470) and no drug x learning type x volatility x group x schedule interaction (F(3,94) = 0.404, p = 0.751). There was a significant drug x learning type x volatility x probabilistic schedule interaction (F(3,94) = 2.725, p = 0.049). Although the pattern of results was comparable, the interaction between drug x learning type x volatility was only statistically significant for schedule 2 (F(1,23) = 12.231, p = 0.002), and did not reach significance for schedules 1 (F(1,24) = 4.015, p = 0.057), 3 (F(1,22) = 2.881, p = 0.104) and 4 (F(1,25) = 0.501, p = 0.486), likely due to a lack of power with these reduced group sizes.
Figure 2. Drug effects on win-stay, lose-shift beta values.
(A) Win-stay, lose-shift betas, for experienced-value learning, in stable and volatile periods, under MPH (purple) and PLA (green). (B) Win-stay, lose-shift betas, for inferred-value learning, in stable and volatile periods, under MPH and PLA. (C) There was a significant interaction between drug and volatility for experienced-value learning, but not for inferred-value learning. (D) Drug effect on win-stay, lose-shift betas for volatile minus stable blocks, for experienced-value (orange) and inferred-value (blue) learning. There was no difference between the social and non-social (roulette) groups. Boxes = standard error of the mean, shaded region = standard deviation, individual datapoints are displayed, MPH = methylphenidate, PLA = placebo, * indicates statistical significance.
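The beta values in this figure come from the authors' own analysis, which is not described on this page. As a simpler illustration of the underlying strategy, the sketch below computes the fraction of choices consistent with win-stay, lose-shift: repeat the previous choice after a win, switch after a loss. The function name and input format are hypothetical.

```python
def wsls_fraction(choices, rewards):
    """Fraction of trials (from the second onward) consistent with
    win-stay, lose-shift.

    choices: sequence of option labels, one per trial
    rewards: sequence of 1 (win) or 0 (loss), one per trial
    """
    if len(choices) != len(rewards):
        raise ValueError("choices and rewards must have the same length")
    if len(choices) < 2:
        raise ValueError("need at least two trials")
    consistent = 0
    for t in range(1, len(choices)):
        stayed = choices[t] == choices[t - 1]
        won_previous = rewards[t - 1] == 1
        # Consistent if the participant stayed after a win
        # or shifted after a loss.
        if stayed == won_previous:
            consistent += 1
    return consistent / (len(choices) - 1)
```

For example, `wsls_fraction(["blue", "blue", "green", "green"], [1, 0, 1, 0])` gives 1.0, because every choice follows the strategy.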
Figure 2—figure supplement 1. Illustration of relationship between experienced-value βWSLS_vol-stable and accuracy.
Experienced-value βWSLS_vol-stable was significantly positively correlated with accuracy under both PLA (Top: Pearson’s r = 0.453, p < 0.001) and MPH (Bottom: Pearson’s r = 0.236, p = 0.017).
Figure 3. Drug effects on learning rates.
(A) Learning rate for experienced-value learning, in stable and volatile periods, under MPH (purple) and PLA (green). (B) Learning rate for inferred-value learning, in stable and volatile periods, under MPH and PLA. (C) There was a significant interaction between drug and volatility for experienced-value learning, but not for inferred-value learning. (D) The interaction between drug and volatility was specific to experienced-value learning even for a sub-sample of participants who were keen inferred-value learners. Boxes = standard error of the mean, shaded region = standard deviation, individual datapoints are displayed, MPH = methylphenidate, PLA = placebo, * indicates statistical significance.
Figure 4. Experienced-value learning rates under PLA, but not MPH, were further from the optimal value in the stable compared to volatile phase.
Figure 5. Participant data (left) juxtaposed against model simulations (right).
Top: Accuracy, in stable and volatile phases, under MPH (purple) and PLA (green). Boxes = standard error of the mean, shaded region = standard deviation, individual datapoints are displayed. Middle and Bottom: Running average, across 5 trials, of blue choices for probabilistic randomisation schedules 1 (middle) and 2 (bottom). Shaded region = standard error of the mean. MPH = methylphenidate, PLA = placebo.
Figure 6. Participant data (left) juxtaposed against model simulations (right).
Top: Accuracy, in stable and volatile phases, under MPH (purple) and PLA (green). Boxes = standard error of the mean, shaded region = standard deviation, individual datapoints are displayed. Middle and Bottom: Running average, across 5 trials, of blue choices for probabilistic randomisation schedules 3 (middle) and 4 (bottom). Shaded region = standard error of the mean. MPH = methylphenidate, PLA = placebo.
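A running average across 5 trials, as plotted in the middle and bottom panels of Figures 5 and 6, could be computed along these lines. This is a sketch only: the captions do not say whether the window is trailing or centred or how edges are handled, so the choice below (full windows only, no padding) is an assumption, as is the function name.

```python
def running_average(values, window=5):
    """Mean of each full window of consecutive trials.

    Assumption: only complete windows are returned, so the output has
    len(values) - window + 1 entries (none if the input is too short).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# 1 = blue chosen, 0 = green chosen (hypothetical choice sequence).
blue_choices = [1, 1, 0, 0, 1, 1]
smoothed = running_average(blue_choices)
```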
Appendix 4—figure 1. p(y|m) = posterior model probability, ϕ = exceedance probability, MPH = purple, PLA = green.
Author response image 1.
