PLoS Comput Biol. 2011 Mar;7(3):e1002012.

doi: 10.1371/journal.pcbi.1002012. Epub 2011 Mar 10.

Learning from sensory and reward prediction errors during motor adaptation

Jun Izawa¹, Reza Shadmehr

PMID: 21423711
PMCID: PMC3053313
DOI: 10.1371/journal.pcbi.1002012

Jun Izawa et al. PLoS Comput Biol. 2011 Mar.

Affiliation

¹ Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America. jizawa@jhu.edu

Abstract

Voluntary motor commands produce two kinds of consequences. Initially, a sensory consequence is observed in terms of activity in our primary sensory organs (e.g., vision, proprioception). Subsequently, the brain evaluates the sensory feedback and produces a subjective measure of utility or usefulness of the motor commands (e.g., reward). As a result, comparisons between predicted and observed consequences of motor commands produce two forms of prediction error. How do these errors contribute to changes in motor commands? Here, we considered a reach adaptation protocol and found that when high quality sensory feedback was available, adaptation of motor commands was driven almost exclusively by sensory prediction errors. This form of learning had a distinct signature: as motor commands adapted, the subjects altered their predictions regarding sensory consequences of motor commands, and generalized this learning broadly to neighboring motor commands. In contrast, as the quality of the sensory feedback degraded, adaptation of motor commands became more dependent on reward prediction errors. Reward prediction errors produced comparable changes in the motor commands, but produced no change in the predicted sensory consequences of motor commands, and generalized only locally. Because we found that there was a within-subject correlation between generalization patterns and sensory remapping, it is plausible that during adaptation an individual's relative reliance on sensory vs. reward prediction errors could be inferred. We suggest that while motor commands change because of sensory and reward prediction errors, only sensory prediction errors produce a change in the neural system that predicts sensory consequences of motor commands.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Figure 1. Experimental setup.**
(A) In the reaching task, subjects held the handle of a robotic arm and made ‘shooting’ movements to move a cursor through a target 10 cm away. The arm was covered by a screen. During adaptation, the cursor-hand relationship was perturbed so that the cursor position was rotated about the start position. The coordinate system is drawn on the left side of the robot (not visible to the subject); clockwise rotation about the start position is positive. The cumulative score for each block was shown to the subject. In the localization task, subjects pointed with their left hand over the screen to the remembered location of their right hand as it crossed the (unseen) target area in the previous trial. In the localization task, the start box was not visible. (B) Experimental paradigms. In ERR, full visual feedback of the cursor position was provided, as well as an animation and sound (target explosion) indicating success or failure of the task. In EPE, the cursor was unseen during the shooting movement but was presented for 200 ms as the hand crossed an imaginary circle with radius equal to the target distance, providing endpoint error with respect to the target. The reward signal was also provided as in the ERR condition. In RWD, no visual feedback of the cursor was provided. The only information available to subjects was the success or failure of the task. (C) Reach angles of three representative subjects during the adaptation phase. The yellow line in the ERR group is the ideal reach angle, which shifted gradually up to 8 degrees with the visual rotation. The gray area indicates the reward region, which shifted on the same schedule in the three groups. (D) Reach variability in the final 100 trials for each group. There were significant differences between ERR and EPE (t-test, p<0.003) and between EPE and RWD (t-test, p<0.001). (E) Results of the localization task for the three subjects. The reach trajectory is plotted for the POST condition. The red line is the RWD subject, the blue line is the ERR subject, and the green line is the EPE subject. The circle around each reach trajectory is the average pointing location in the localization trials.
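The cursor rotation and reward region described in this caption can be sketched in a few lines of Python. This is only an illustration of the geometry, not the authors' experimental code: the function names, the choice of "straight ahead" as the +y axis, and the reward-region half-width of 2 degrees are assumptions made for the example.

```python
import math

def rotate_cursor(hand_xy, rotation_deg):
    """Rotate the hand position about the start position (the origin).

    Positive angles are clockwise, following the caption's convention.
    """
    x, y = hand_xy
    theta = math.radians(rotation_deg)
    # A clockwise rotation by theta is a counterclockwise rotation by -theta.
    cx = x * math.cos(theta) + y * math.sin(theta)
    cy = -x * math.sin(theta) + y * math.cos(theta)
    return cx, cy

def reach_angle_deg(xy):
    """Direction of a point, measured clockwise from straight ahead (+y)."""
    x, y = xy
    return math.degrees(math.atan2(x, y))

def is_rewarded(cursor_xy, target_angle_deg=0.0, half_width_deg=2.0):
    """True if the cursor direction lies inside the reward region.

    half_width_deg is a hypothetical value chosen for illustration.
    """
    return abs(reach_angle_deg(cursor_xy) - target_angle_deg) <= half_width_deg
```

Under these assumptions, a hand that reaches straight ahead under an 8 degree rotation produces a cursor direction of 8 degrees and misses the reward region, while a hand aimed at -8 degrees brings the cursor back onto the target.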

**Figure 2. The sensory remapping and the generalization function.**
(A) The average estimated location of the hand in the PRE and POST conditions. Error bars are SEM. (B) Generalization of adaptation from the learned target direction (at 0°) to neighboring target directions. (C) Illusion index (change in estimated location of the hand from PRE to POST adaptation) as a function of generalization index for subjects in the EPE condition. Each dot represents one subject's data. There was a significant negative correlation between the two indices (R = −0.68, p = 0.02).

**Figure 3. The theoretical problem of learning motor control.**
(A) A generative model of the motor adaptation task. Motor commands are corrupted by a perturbation, which results in a hand position that is sensed via a cursor and may also result in reward. The objective of the learner is to find the motor commands that maximize reward. White circles are hidden variables and gray circles are observed variables. Arrows indicate conditional probabilities. (B) Model of the optimal learner. The learning system is composed of two compensatory mechanisms: an action selector and an internal forward model. On trial k, the action selector outputs the motor command that moves the state of the body and task from its current value to its value on the next trial. The state variable includes three elements: hand position h, perturbation p, and the position t. The brain observes only part of the state of the body. At the same time, the learner predicts the transition of the body state from the efference copy of the motor command. Kalman filtering corrects the prediction to minimize the sensory prediction error, yielding the updated state estimate. The action selector selects the optimal action on the next trial as a function of the updated state. (C) Sample disturbance and the response of the model. The task is to control the reach angle. The clockwise (CW) direction is positive and the target is at 0°. The uncertainty of the visual feedback was varied to modulate the Kalman gain. The simulations predict a remapping of estimated hand position that is modulated by the level of visual uncertainty.
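The link between visual uncertainty and the Kalman gain in panel (C) can be illustrated with a one-dimensional filter. The sketch below is a deliberate simplification and not the paper's model: it tracks only a scalar perturbation that drifts as a random walk, and the noise variances `q` and `r` and the function names are assumed for the example.

```python
def kalman_step(p_hat, var, y, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    p_hat, var : prior estimate of the perturbation and its variance
    y          : noisy observation of the perturbation
    q, r       : process and observation noise variances (assumed values)
    """
    var_pred = var + q                  # predict: random-walk drift adds variance
    gain = var_pred / (var_pred + r)    # Kalman gain
    error = y - p_hat                   # sensory prediction error
    p_new = p_hat + gain * error        # correct the prediction
    var_new = (1.0 - gain) * var_pred   # posterior variance
    return p_new, var_new, gain

def steady_state_gain(q, r, n_iter=200):
    """Iterate the variance recursion until the gain settles."""
    var, gain = 1.0, 0.0
    for _ in range(n_iter):
        # The gain depends only on the variances, so the observation is arbitrary.
        _, var, gain = kalman_step(0.0, var, 0.0, q, r)
    return gain
```

In this simplified setting, increasing the observation noise `r` (less reliable visual feedback) lowers the steady-state gain, so each sensory prediction error changes the estimate less.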

**Figure 4. Estimated contribution of reward and sensory prediction errors to change in motor output during adaptation.**
When subjects experienced the ERR and EPE conditions, we assumed that the motor commands were produced by the sum of two memories, one updated by the sensory prediction error and the other updated by the reward prediction error. The best-fit parameters predict the update of the two memories. The thick black line is the subjects' average reach angle during the adaptation period. The gray shading is SEM. The superimposed purple line is the reach angle estimated by the model, which is a combination of the two memories (red and blue lines). In the RWD condition, the motor commands are updated only by the reward prediction error.
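The idea of a motor command formed as the sum of two memories can be sketched as a small simulation. This is not the authors' fitted model: the caption states only which error drives each memory, so the update rules here are assumptions. The sensory memory follows a simple forward-model estimate of the rotation, the reward memory uses a REINFORCE-style rule that reinforces exploratory deviations yielding better-than-predicted reward, and every parameter value and name (`eta_s`, `eta_r`, `explore_sd`, `reward_half_width`, the ramp schedule) is hypothetical.

```python
import random

def simulate_two_memories(n_trials=300, use_sensory=True, seed=0,
                          max_rotation=8.0, ramp_per_trial=0.05,
                          eta_s=0.1, eta_r=0.5, explore_sd=1.0,
                          reward_half_width=2.0):
    """Simulate reach angles (deg) produced by two summed memories."""
    rng = random.Random(seed)
    u_s = 0.0    # memory updated by the sensory prediction error
    u_r = 0.0    # memory updated by the reward prediction error
    p_hat = 0.0  # forward-model estimate of the rotation
    r_hat = 0.0  # predicted reward (running average of past reward)
    reach_angles = []
    for k in range(n_trials):
        rotation = min(max_rotation, ramp_per_trial * k)  # gradual schedule
        explore = rng.gauss(0.0, explore_sd)              # motor exploration
        u = u_s + u_r + explore                           # executed reach angle
        cursor = u + rotation
        reward = 1.0 if abs(cursor) <= reward_half_width else 0.0

        # Reward prediction error: observed minus predicted reward.
        rpe = reward - r_hat
        u_r += eta_r * rpe * explore
        r_hat += 0.1 * rpe

        if use_sensory:
            # Sensory prediction error: observed minus predicted cursor angle.
            spe = cursor - (u + p_hat)
            p_hat += eta_s * spe
            u_s = -p_hat  # compensate for the estimated rotation
        reach_angles.append(u)
    return {"u_s": u_s, "u_r": u_r, "reach_angles": reach_angles}
```

Calling `simulate_two_memories(use_sensory=False)` mimics a reward-only condition in which only `u_r` can change, whereas the default call lets the sensory memory absorb the rotation.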

Cited by

Continuous reports of sensed hand position during sensorimotor adaptation.
Tsay JS, Parvin DE, Ivry RB. J Neurophysiol. 2020 Oct 1;124(4):1122-1130. doi: 10.1152/jn.00242.2020. Epub 2020 Sep 9. PMID: 32902347.
The Errors of Our Ways: Understanding Error Representations in Cerebellar-Dependent Motor Learning.
Popa LS, Streng ML, Hewitt AL, Ebner TJ. Cerebellum. 2016 Apr;15(2):93-103. doi: 10.1007/s12311-015-0685-5. PMID: 26112422. Review.
Sensory prediction errors, not performance errors, update memories in visuomotor adaptation.
Lee K, Oh Y, Izawa J, Schweighofer N. Sci Rep. 2018 Nov 7;8(1):16483. doi: 10.1038/s41598-018-34598-y. PMID: 30405177.
Clustering analysis of movement kinematics in reinforcement learning.
Sidarta A, Komar J, Ostry DJ. J Neurophysiol. 2022 Feb 1;127(2):341-353. doi: 10.1152/jn.00229.2021. Epub 2021 Dec 22. PMID: 34936514.
Feedback Modulates Audio-Visual Spatial Recalibration.
Kramer A, Röder B, Bruns P. Front Integr Neurosci. 2020 Jan 17;13:74. doi: 10.3389/fnint.2019.00074. eCollection 2019. PMID: 32009913.
