Reward boosts reinforcement-based motor learning

Pierre Vassiliadis^{1,2}, Gerard Derosiere¹, Cecile Dubuc¹, Aegryan Lete¹, Frederic Crevecoeur^{1,3}, Friedhelm C Hummel^{2,4,5}, Julie Duque¹

Affiliations

¹ Institute of Neuroscience, Université Catholique de Louvain, 53, Avenue Mounier, Brussels 1200, Belgium.
² Defitech Chair for Clinical Neuroengineering, Center for Neuroprosthetics (CNP) and Brain Mind Institute (BMI), Swiss Federal Institute of Technology (EPFL), Geneva 1202, Switzerland.
³ Institute of Information and Communication Technologies, Electronics and Applied Mathematics, Université Catholique de Louvain, Louvain-la-Neuve 1348, Belgium.
⁴ Defitech Chair for Clinical Neuroengineering, Center for Neuroprosthetics (CNP) and Brain Mind Institute (BMI), Swiss Federal Institute of Technology Sion (EPFL), Sion 1951, Switzerland.
⁵ Clinical Neuroscience, University of Geneva Medical School (HUG), Geneva 1202, Switzerland.

PMID: 34345810
PMCID: PMC8319366
DOI: 10.1016/j.isci.2021.102821

Pierre Vassiliadis et al. iScience. 2021 Jul 7;24(7):102821. doi: 10.1016/j.isci.2021.102821. eCollection 2021 Jul 23.

Abstract

Besides relying heavily on sensory and reinforcement feedback, motor skill learning may also depend on the level of motivation experienced during training. Yet, how motivation by reward modulates motor learning remains unclear. In 90 healthy subjects, we investigated the net effect of motivation by reward on motor learning while controlling for the sensory and reinforcement feedback received by the participants. Reward improved motor skill learning beyond performance-based reinforcement feedback. Importantly, the beneficial effect of reward involved a specific potentiation of reinforcement-related adjustments in motor commands, which concerned primarily the most relevant motor component for task success and persisted on the following day in the absence of reward. We propose that the long-lasting effects of motivation on motor learning may entail a form of associative learning resulting from the repetitive pairing of the reinforcement feedback and reward during training, a mechanism that may be exploited in future rehabilitation protocols.

Keywords: Behavioral neuroscience; Cognitive neuroscience; Neuroscience; Sensory neuroscience.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
The motor skill learning task. (A) Time course of a trial in the motor skill learning task. Each trial started with the appearance of a sidebar and a target. After a variable preparatory phase (800-1000 ms), a cursor appeared in the sidebar, playing the role of a "Go" signal. At this moment, participants were required to pinch the force transducer to bring the cursor into the target as quickly as possible and maintain it there until the end of the task (2000 ms). Notably, on most trials, the cursor disappeared halfway toward the target (as displayed here). Then, reinforcement feedback was presented as a colored circle for 1000 ms; it either provided binary knowledge of performance (Success or Failure in Block-SR and Block-SRR) or was non-informative (Block-S). The reinforcement feedback was determined by comparing the Error on the trial with the individual success threshold (computed in the Calibration block, see STAR Methods). Finally, each trial ended with a reminder of the color/feedback association and of the potential reward associated with good performance (1500 ms). (B) Experimental procedure. On Day 1, all participants performed two familiarization blocks in a Block-SR condition. The first one involved full vision of the cursor, while the second one provided only partial vision and served to calibrate the difficulty of the task on an individual basis (see STAR Methods). Then, Pre- and Post-training Block-SR assessments were separated by 6 blocks of training in the condition corresponding to each individual group (Block-S for Group-S, Block-SR for Group-SR and Block-SRR for Group-SRR). Day 2 involved a Familiarization block (with partial vision) followed by a Re-test assessment (4 Block-SR pooled together). There was no recalibration on Day 2. (C) Example of a force profile. Force applied (in % of MVC) during the task. Participants were asked to approximate the Target_Force as quickly and accurately as possible to minimize the Error (gray shaded area). As shown in the figure, this Error depended on the speed of force initiation (Force_Initiation) and on the accuracy of the maintained force, as reflected by its amplitude with respect to the Target_Force (Force_AmplError) and its variability (Force_Variability). Note that the first 150 ms of each trial were not considered for the computation of the Error.
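The Error and feedback rules described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the caption describes the Error as the shaded area between the applied force and the Target_Force, and here it is approximated as the mean absolute deviation over the samples kept after the first 150 ms. The function names, the sampling-rate parameter and the strict "below threshold" rule for Success are hypothetical and are not taken from the paper.

```python
def trial_error(force, target_force, sample_rate_hz, skip_ms=150.0):
    """Approximate the per-trial Error (in % MVC).

    Assumption: Error is the mean absolute deviation of the applied force
    from the target, ignoring the first `skip_ms` of the trial.
    """
    skip = int(round(skip_ms * sample_rate_hz / 1000.0))
    kept = force[skip:]
    if not kept:
        raise ValueError("trial is shorter than the skipped window")
    return sum(abs(f - target_force) for f in kept) / len(kept)


def reinforcement_feedback(error, success_threshold):
    """Binary knowledge of performance.

    Assumption: a trial counts as a Success when its Error is strictly
    below the individual success threshold.
    """
    return "Success" if error < success_threshold else "Failure"


# Toy force trace sampled at 20 Hz, so the first 3 samples (150 ms) are skipped.
force_trace = [0.0, 5.0, 10.0, 12.0, 8.0, 10.0]
error = trial_error(force_trace, target_force=10.0, sample_rate_hz=20)
outcome = reinforcement_feedback(error, success_threshold=2.0)
```

In this toy trace only the last three samples contribute, so slow force initiation would raise the Error only if it extended beyond the skipped window.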

**Figure 2**
Effect of reward on motor skill learning. (A) Error. Average Error is represented across practice for the three experimental groups (gray: Group-S, light green: Group-SR, dark green: Group-SRR). The gray shaded area highlights the blocks concerned by the reinforcement manipulation. The remaining blocks were performed with knowledge of performance only (*i.e.,* in a Block-SR setting). (B) Skill learning. Bar plot (left) and violin plot (right, each dot = one subject) representing skill learning (quantified as the Error in Post-training blocks expressed in percentage of Pre-training blocks) in the three experimental groups. Skill learning was significantly enhanced in Group-SRR compared to the two other groups. This result remained significant when removing the subject showing an extreme value in Group-SR (ANOVA: F(2,86) = 6.44, p = 0.0025, partial η² = 0.13; post-hocs: Group-SRR vs. Group-SR: p = 0.027; Group-SRR vs. Group-S: p = 0.00064; Group-SR vs. Group-S: p = 0.21). (C) Skill maintenance. Bar plot (left) and violin plot (right) representing skill maintenance (quantified as the Error in Re-test blocks expressed in percentage of Pre-training blocks) in the three experimental groups. (D) Success. Proportion of successful trials for each block. (E) Force profiles. Individual force profiles of one representative subject of Group-S (left), Group-SR (middle) and Group-SRR (right) in the Pre- (gray) and Post-training blocks (blue). Note the better approximation of the Target_Force and the reduced inter-trial variability at Post-training in the exemplar subject of Group-SRR. ∗: significant difference between groups (p < 0.05). #: significant difference within a group between normalized Post-training Error and a constant value of 100% (p < 0.017 to account for multiple comparisons). Data are represented as mean ± SE.
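The skill learning index used in panels B and C (Error in a later block expressed in percentage of the Pre-training Error) can be written as a one-line computation. The sketch below assumes that the block-level Error is the simple mean of the trial Errors; the function name and the example values are hypothetical.

```python
from statistics import mean


def normalized_error(pre_errors, later_errors):
    """Error in a later block (Post-training or Re-test) as a percentage of Pre-training.

    Assumption: each block is summarized by the mean of its trial Errors.
    Values below 100 indicate a smaller Error than at Pre-training.
    """
    return 100.0 * mean(later_errors) / mean(pre_errors)


# Made-up trial Errors for one subject: Pre-training mean 10, Post-training mean 7.
skill_learning = normalized_error([10.0, 10.0], [8.0, 6.0])
```

With this convention, a value of 100% means no change from Pre-training, which is the constant that the within-group tests in the caption compare against.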

**Figure 3**
Between-trial adjustments in the Error. (A) Reinforcement-based adjustments in the Error during Day 1 training. Absolute between-trial adjustments in the Error (Error_BTC = |Error_{n+1} - Error_n|) according to the reinforcement feedback (*i.e.,* Success or Failure) encountered at trial_n in the three Group_TYPES (gray: Group-S, light green: Group-SR, dark green: Group-SRR). Notably, these bins of trials were constituted based on the success threshold-normalized Error at trial_n in order to compare adjustments in motor commands following trials of similar Error in the three groups. Stars denote significant group differences in Error_BTC for a given outcome (left panel, see STAR Methods). Reinforcement-based adjustments (Error_BTC after Failure in percentage of Error_BTC after Success) were compared in the three Group_TYPES (right panel). (B) Correlations between the magnitude of reinforcement-based adjustments in the Error and the average success rate on the next trial, showing the relevance of these adjustments in the present task. Each dot represents a subject. (C, D) Same for Day 2 training. Note that reinforcement-based adjustments in motor commands remained amplified in Group-SRR, despite the absence of reward on Day 2. (E) Sensory-based adjustments in the Error during Day 1 training. Error_BTC following trials_n with Failures of different Error magnitudes (left panel). Sensory-based adjustments (Error_BTC after Large Failure in percentage of Error_BTC after Small Failure) were compared in the three Group_TYPES (right panel). (F) Correlations between the magnitude of sensory-based adjustments in the Error and the probability of success on the next trial, showing the relevance of these adjustments for task success. (G, H) Same for Day 2 training. ∗: p < 0.05. Data are represented as mean ± SE.
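The two quantities defined in this caption, the absolute between-trial change and the reinforcement-based adjustment, can be illustrated with a short Python sketch. It is deliberately simplified: it pools all trials and omits the binning on threshold-normalized Error that the caption describes, and the function names and data are hypothetical.

```python
from statistics import mean


def between_trial_changes(errors):
    """Absolute change in Error from trial n to trial n+1, one value per trial n."""
    return [abs(nxt - cur) for cur, nxt in zip(errors, errors[1:])]


def reinforcement_adjustment(errors, outcomes):
    """Error_BTC after Failure in percentage of Error_BTC after Success.

    `outcomes[n]` is the feedback received at trial n ("Success" or "Failure").
    Simplification: no error-matched binning is applied here.
    """
    btc = between_trial_changes(errors)
    # zip stops at the shorter list, so the last trial (no successor) is dropped.
    after_success = [c for c, o in zip(btc, outcomes) if o == "Success"]
    after_failure = [c for c, o in zip(btc, outcomes) if o == "Failure"]
    return 100.0 * mean(after_failure) / mean(after_success)


# Made-up sequence of five trials.
errors = [4.0, 6.0, 3.0, 3.0, 5.0]
outcomes = ["Success", "Failure", "Success", "Failure", "Success"]
adjustment = reinforcement_adjustment(errors, outcomes)
```

A value above 100% means that the Error changed more after a Failure than after a Success. The same ratio, computed between Large and Small Failures instead of Failure and Success, would give the sensory-based adjustment described in panel E.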

**Figure 4**
Between-trial adjustments in initiation time, amplitude error and variability. Reinforcement-based adjustments in the Force_Initiation (A), Force_AmplError (B) and Force_Variability (C). Absolute between-trial changes (BTC) for each motor component (Force_BTC = |Force_{n+1} - Force_n|) according to the reinforcement feedback (*i.e.,* Success or Failure) encountered at trial_n in the three Group_TYPES (gray: Group-S, light green: Group-SR, dark green: Group-SRR). Notably, these bins of trials were constituted based on the success threshold-normalized Error at trial_n. Stars denote significant group differences in Error_BTC for a given outcome (left panel). Reinforcement-based adjustments (Force_BTC after Failure in percentage of Force_BTC after Success) in the three Group_TYPES (right panel). Sensory-based adjustments in the Force_Initiation (D), Force_AmplError (E) and Force_Variability (F). Force_BTC following trials_n with Failures of different Error magnitudes (left panel). Sensory-based adjustments (Force_BTC after Large Failure in percentage of Force_BTC after Small Failure) in the three Group_TYPES (right panel). ∗: p < 0.05. Data are represented as mean ± SE.
