J Neurosci. 2017 Jun 21;37(25):6087-6097. doi: 10.1523/JNEUROSCI.2081-16.2017. Epub 2017 May 24.

A Selective Role for Dopamine in Learning to Maximize Reward But Not to Minimize Effort: Evidence from Patients with Parkinson's Disease

Vasilisa Skvortsova et al. J Neurosci.

Abstract

Instrumental learning is a fundamental process through which agents optimize their choices, taking into account various dimensions of the available options such as the possible reward or punishment outcomes and the costs associated with potential actions. Although the involvement of dopamine in learning from choice outcomes is well established, less is known about its role in learning about action costs such as effort. Here, we tested the ability of patients with Parkinson's disease (PD) to maximize monetary rewards and minimize physical efforts in a probabilistic instrumental learning task. The involvement of dopamine was assessed by comparing performance ON and OFF prodopaminergic medication. In a first sample of PD patients (n = 15), we observed that reward learning, but not effort learning, was selectively impaired in the absence of treatment, with a significant interaction between learning condition (reward vs effort) and medication status (OFF vs ON). These results were replicated in a second, independent sample of PD patients (n = 20) using a simplified version of the task. According to Bayesian model selection, the best account for medication effects in both studies was a specific amplification of reward magnitude in a Q-learning algorithm. These results suggest that learning to avoid physical effort is independent of dopaminergic circuits and strengthen the general idea that dopaminergic signaling amplifies the effects of reward expectation or obtainment on instrumental behavior.

Significance Statement

Theoretically, maximizing reward and minimizing effort could involve the same computations and therefore rely on the same brain circuits. Here, we tested whether dopamine, a key component of reward-related circuitry, is also implicated in effort learning. We found that patients suffering from dopamine depletion due to Parkinson's disease were selectively impaired in reward learning, but not effort learning.
Moreover, anti-parkinsonian medication restored the ability to maximize reward, but had no effect on effort minimization. This dissociation suggests that the brain has evolved separate, domain-specific systems for instrumental learning. These results help to disambiguate the motivational role of prodopaminergic medications: they amplify the impact of reward without affecting the integration of effort cost.
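The winning model described in the abstract can be illustrated with a minimal sketch: a standard Q-learning update in which a sensitivity parameter scales the outcome magnitude before the prediction error is formed. The function and parameter names below (k for sensitivity, alpha for learning rate, beta for inverse temperature) are illustrative and follow the figure legends, not the authors' actual code.

```python
import math

def q_update(q, outcome, k, alpha):
    """One Q-learning step. The sensitivity k scales the outcome
    magnitude before the prediction error is formed, so amplifying
    the reward sensitivity (the hypothesized ON-medication effect)
    inflates reward prediction errors."""
    prediction_error = k * outcome - q
    return q + alpha * prediction_error

def p_choose_left(v_left, v_right, beta):
    """Softmax probability of choosing the left option, with inverse
    temperature beta controlling choice stochasticity."""
    return 1.0 / (1.0 + math.exp(-beta * (v_left - v_right)))

# Option values combine the two learned dimensions:
# expected reward minus expected effort cost.
def net_value(q_reward, q_effort):
    return q_reward - q_effort
```

On this account, medication multiplies the sensitivity k for reward outcomes only, steepening the reward learning curve while leaving effort learning, which is driven by its own sensitivity and update, untouched.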

Keywords: Parkinson's disease; dopamine; effort learning; modeling; reinforcement learning; reward learning.


Figures

Figure 1.
Behavioral tasks used in Study 1 (A) and Study 2 (B). Successive screenshots from left to right illustrate the timing of stimuli and responses for one example trial. When interrogation dots appeared on screen, subjects had to choose between right and left options. Each option was associated with both a monetary reward and a physical effort. For the chosen option only (left in the example), reward level was indicated by an image of the corresponding coin and effort level by the horizontal bar on the thermometer. At the GO! signal, subjects had to squeeze the chosen handgrip (left in the example) until the red fluid level reached the horizontal bar. At that moment, subjects were notified that the reward was added to their cumulative payoff. Changes between the two studies relate to the symbolic cues (one per condition in Study 1 vs one per option in Study 2), the timing (subjects had to wait for a fixed delay before responding in Study 1, whereas they could choose and squeeze as soon as cues and outcomes were displayed in Study 2), and the reward levels (10¢ and 20¢ in Study 1 vs 10¢ and 50¢ in Study 2). Effort levels (20% and 80% of maximal force) were unchanged. Bar graphs on the right illustrate the contingencies between symbolic cues and both reward (red) and effort (blue) outcomes (left and right graphs). Bars indicate the probability of getting high reward/effort outcomes (or one minus the probability of getting low reward/effort outcomes) for left and right options (empty and filled bars, below and above the x-axis). In Study 1, there were four different contingency sets cued by four different symbols (A–D), whereas in Study 2, there were only two contingency sets cued by two different pairs of symbolic cues (A, C). In each contingency set, one dimension (reward or effort) was fixed such that learning was only possible for the other dimension. Reward learning was assessed by red sets (A, B in Study 1; A in Study 2) and effort learning by blue sets (C, D in Study 1; C in Study 2). The illustration only applies to one task session. Contingencies were fully counterbalanced across the four sessions.
Figure 2.
Behavioral results of Study 1 (A) and Study 2 (B). Left, Learning curves show cumulative scores, that is, money won (big minus small reward) in the reward context (empty and filled red circles for OFF and ON medication states, respectively) and effort avoided (low minus high effort outcomes) in the effort context (empty and filled blue circles for OFF and ON states, respectively). Shaded areas represent trial-by-trial intersubject SEM. Lines indicate linear regression fit. Right, Bar graphs show mean correct response rates (same color coding as for the learning curves, with gray bars for the control group in Study 2). Dotted lines correspond to chance-level performance. Error bars indicate ± intersubject SEM. Stars indicate significant main effects of treatment and interaction with learning condition (p < 0.05). CON, Control; EL, effort learning; RL, reward learning.
Figure 3.
Model comparison based on data from Study 1 (A) and Study 2 (B). Top, Bayesian comparisons of model families. For each parameter, two families corresponding to two halves of the model space were compared: all models including versus all models excluding the possibility of a medication effect on the considered parameter. Bars show the exceedance probability (XP) obtained for the presence of a medication effect on six model parameters: kR, kE, αR, αE, γ, and β. Note that the XP for the null hypothesis (absence of a medication effect) is simply one minus that shown on the graph. Red dotted line corresponds to the significance threshold (exceedance probability of 0.95). Inset, Mean posterior estimate for the reward sensitivity parameter kR ± intersubject SEM. Star indicates significant difference from one (p < 0.05). Bottom, Scatter plots of interpatient correlations between observed correct choices and correct choices predicted from the kR-only model for the two learning conditions and medication states. Each dot represents one subject. Shaded areas indicate 95% confidence intervals on linear regression estimates.
Figure 4.
Model simulations for Study 1 (A) and Study 2 (B). Left, Simulated learning curves show cumulative scores, that is, money won (big minus small reward) in the reward context (empty and filled red circles for OFF and ON medication states, respectively) and effort avoided (low minus high effort outcomes) in the effort context (empty and filled blue circles for OFF and ON states, respectively). Shaded areas represent trial-by-trial intersubject SEM. Lines indicate linear regression fit. Right, Bar graphs show mean correct response rates (same color coding as for the learning curves, with gray bars for the control group in Study 2). Dotted lines correspond to chance-level performance. Error bars indicate ± intersubject SEM. Stars indicate significant main effects of treatment and interaction with learning condition (p < 0.05). EL, Effort learning; RL, reward learning.
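The exceedance probabilities (XP) reported in Figure 3 answer the question: how likely is it that one model family is more frequent in the population than the other? For a two-family comparison with a Dirichlet posterior over family frequencies, XP can be estimated by Monte Carlo, as in the sketch below. The Dirichlet counts are illustrative assumptions; this is not the authors' actual random-effects Bayesian model selection procedure.

```python
import random

def exceedance_probability(alpha1, alpha2, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(frequency of family 1 > family 2)
    under a Dirichlet(alpha1, alpha2) posterior. A 2-component
    Dirichlet sample is obtained from two independent Gamma draws
    (normalization cancels when comparing the two components)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        g1 = rng.gammavariate(alpha1, 1.0)
        g2 = rng.gammavariate(alpha2, 1.0)
        if g1 > g2:
            wins += 1
    return wins / n_samples
```

With evenly matched families (equal counts) the XP hovers near 0.5; it crosses the 0.95 threshold marked in the figure only when the evidence strongly favors one family.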
