Temporal discounting of reward and the cost of time in motor control

Reza Shadmehr¹, Jean Jacques Orban de Xivry, Minnan Xu-Wilson, Ting-Yu Shih

PMID: 20685993
PMCID: PMC2926660
DOI: 10.1523/JNEUROSCI.1343-10.2010

Reza Shadmehr et al. J Neurosci. 2010 Aug 4;30(31):10507-16. doi: 10.1523/JNEUROSCI.1343-10.2010.

Affiliation

¹ Department of Biomedical Engineering, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA. shadmehr@jhu.edu

Abstract

Why do movements take a characteristic amount of time, and why do diseases that affect the reward system alter control of movements? Suppose that the purpose of any movement is to position our body in a more rewarding state. People and other animals discount future reward as a hyperbolic function of time. Here, we show that across populations of people and monkeys there is a correlation between discounting of reward and control of movements. We consider saccadic eye movements and hypothesize that duration of a movement is equivalent to a delay of reward. The hyperbolic cost of this delay not only accounts for kinematics of saccades in adults, it also accounts for the faster saccades of children, who temporally discount reward more steeply. Our theory explains why saccade velocities increase when reward is elevated, and why disorders in the encoding of reward, for example in Parkinson's disease and schizophrenia, produce changes in saccades. We show that delay of reward elevates the cost of saccades, reducing velocities. Finally, we consider coordinated movements that include motion of eyes and head and find that their kinematics is also consistent with a hyperbolic, reward-dependent cost of time. Therefore, each voluntary movement carries a cost because its duration delays acquisition of reward. The cost depends on the value that the brain assigns to stimuli, and the rate at which it discounts this value in time. The motor commands that move our eyes reflect this cost of time.

Figures

**Figure 1.**
The cost of a saccade. Here, two forms of temporal discounting are considered: quadratic and hyperbolic. A, For a 20° saccade, the cost in Equation 5 is plotted as a function of movement duration p. Both quadratic and hyperbolic costs of time can produce a total cost that has a minimum at ∼85 ms. B, Expected value of the cost as a function of movement duration. Each curve is the cost for a movement of constant amplitude. The curves are drawn for saccade amplitudes in the range 10–80°, by intervals of 10°. The tick marks near the x-axis are the optimal durations (i.e., the movement durations that produce minimum cost). For quadratic cost of time, movement durations get closer to each other as movement amplitude increases. For a hyperbolic cost, the durations get farther apart as amplitudes increase. Quadratic discount parameter: α = 5.75 × 10⁴. Hyperbolic discount parameters: α = 0.8 × 10⁴ and β = 2.5.

**Figure 2.**
Effect of cost of time on movement durations. The data points are from Collewijn et al. (1988). The dashed line, also from Collewijn et al. (1988), is a good predictor of saccade durations in the range of 5–30° but underestimates durations for larger amplitudes. Quadratic (*J_p* = αp²) or linear (*J_p* = αp) costs cannot account for the fact that saccade durations increase faster than linearly as a function of saccade amplitudes. The shaded areas along each curve represent the effect of changing stimulus value α by ±20%. The hyperbolic discounting not only accounts for the faster than linear increase in durations but also for the variability in this relationship: as stimulus value α changes, it has little effect on saccade durations for short amplitudes, but a greater effect for large amplitudes. Quadratic: α = 5.75 × 10⁴. Linear: α = 1.2 × 10⁴. Hyperbolic: α = 0.8 × 10⁴ and β = 2.5. The red error bars are SD.
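
The three costs of time named in this caption can be compared numerically. The sketch below is illustrative only: the quadratic and linear forms are the ones written in the caption, but the hyperbolic form `alpha * (1 - 1 / (1 + beta * p))` is an assumption (the paper's Equation 6 is not reproduced on this page), as is the choice to measure duration `p` in seconds. The function names and the `marginal_cost` helper are hypothetical.

```python
# Costs of time as a function of movement duration p.
# Assumption: p is in seconds. Parameter defaults are the caption values.

def quadratic_cost(p, alpha=5.75e4):
    """Quadratic cost of time, J_p = alpha * p^2 (form given in the caption)."""
    return alpha * p ** 2


def linear_cost(p, alpha=1.2e4):
    """Linear cost of time, J_p = alpha * p (form given in the caption)."""
    return alpha * p


def hyperbolic_cost(p, alpha=0.8e4, beta=2.5):
    """ASSUMED hyperbolic form: value lost when a reward of value alpha
    is delayed by p and discounted as alpha / (1 + beta * p)."""
    return alpha * (1.0 - 1.0 / (1.0 + beta * p))


def marginal_cost(cost, p, dp=1e-6):
    """Central-difference estimate of d(cost)/dp at duration p."""
    return (cost(p + dp) - cost(p - dp)) / (2.0 * dp)


# Print the marginal cost of time at a few durations (in seconds).
for p in (0.05, 0.10, 0.20):
    print(
        f"p = {p:.2f} s",
        f"quadratic: {marginal_cost(quadratic_cost, p):.0f}",
        f"linear: {marginal_cost(linear_cost, p):.0f}",
        f"hyperbolic: {marginal_cost(hyperbolic_cost, p):.0f}",
    )
```

Under these assumptions, the marginal cost of an additional unit of time grows with duration for the quadratic form, stays constant for the linear form, and shrinks for the hyperbolic form. This is one way to read the caption's contrast between the three curves. Predicting actual saccade durations would also require the movement-dependent terms of the total cost, which are not given here.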

**Figure 3.**
Change in the reward discount function predicts change in saccade velocities. The lines are simulation results, and the numbers refer to data from previous publications. For each line, the stimulus value α was kept constant. A, Saccade velocities in Parkinson's disease and healthy controls from data in the studies by Shibasaki et al. (1979), Collewijn et al. (1988), White et al. (1983), Blekher et al. (2000), and Nakamura et al. (1991). Reducing the stimulus value decreases saccade speeds. The changes in saccade speeds are bigger for large-amplitude saccades than small amplitudes. Parameter values are as follows: α = 0.52 × 10⁴ to 1.08 × 10⁴ and β = 2.5. B, Saccade velocities in children and young adults. Increasing the rate of discounting of reward (α in Eq. 6) by a factor of 2 produces saccade velocities that are similar to those seen in children. The data are from the studies by Fioravanti et al. (1995), Collewijn et al. (1988), and White et al. (1983). Parameter values are as follows: children, α = 2.16 × 10⁴; adults, α = 1.08 × 10⁴, β = 2.5. C, Saccade velocities in adult humans and rhesus monkeys. The dashed line represents simulations for which a rhesus monkey eye plant was combined with a human temporal discount function. The black line represents simulations for which a monkey eye plant was combined with a monkey temporal discount function (α = 6.5 × 10⁴). For the human simulations, α = 1.08 × 10⁴. The data on monkey saccades are from the study by Freedman (2008).

**Figure 4.**
Delaying the stimulus discounts stimulus value. A, Experimental paradigm. Volunteers were asked to look at a stimulus, but after saccade initiation, the stimulus was removed. The stimulus was redisplayed at time Δ after saccade end. B, The black line is the theoretical estimate of reward prediction error (Eq. 17). Parameter values are as follows: α = 1.08 × 10⁴, β = 2.5. Saccade duration is p = 110 ms. The data points are experimental results, showing within-subject change in peak saccade velocity with respect to the no-delay condition. The changes in saccade velocity are proportional to reward prediction error. C, Within-subject change in saccade amplitudes were uncorrelated with feedback delay. The horizontal and vertical error bars are SEM.
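
The idea that a feedback delay Δ discounts stimulus value can be sketched as follows. This is not the paper's Equation 17, which is not reproduced on this page. It is a minimal stand-in that assumes a hyperbolic value function `alpha / (1 + beta * t)` with time in seconds, and compares the value when the stimulus reappears at `p + delta` against the no-delay value at `p`. The function names and the delay values in the loop are hypothetical; the defaults for `alpha`, `beta`, and `p` are taken from the caption.

```python
# Change in hyperbolically discounted stimulus value caused by a feedback delay.
# Assumption: times are in seconds; this is a stand-in, not the paper's Eq. 17.

def discounted_value(alpha, beta, t):
    """ASSUMED hyperbolic discounting: value alpha received after time t."""
    return alpha / (1.0 + beta * t)


def value_change_with_delay(delta, p=0.110, alpha=1.08e4, beta=2.5):
    """Discounted value with the stimulus redisplayed delta seconds after
    saccade end, minus the value in the no-delay condition."""
    delayed = discounted_value(alpha, beta, p + delta)
    no_delay = discounted_value(alpha, beta, p)
    return delayed - no_delay


# Illustrative delays only; these are not the delays used in the experiment.
for delta in (0.0, 0.25, 0.5, 1.0):
    print(f"delta = {delta:.2f} s, change in value = {value_change_with_delay(delta):.1f}")
```

In this sketch the change is zero at no delay and becomes more negative as the delay grows, with each additional increment of delay costing less than the previous one. That is the qualitative shape a hyperbolic discount function implies.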

**Figure 5.**
Kinematic characteristics of gaze appear consistent with a hyperbolic temporal discounting of reward. A, Simulation results for a gaze change to a target at 45°. Both the eyes and the head contribute to the gaze change, with the eyes leading the movement. B, Displacement of eye and head as a function of gaze amplitude. The gray region represents data from the study by Goossens and Van Opstal (1997). C, Peak gaze velocity as a function of gaze amplitude for three forms of temporal discounting. The data points are from the study by Epelboim et al. (1997). Parameter values are as follows: hyperbolic, α = 1.35 × 10⁴ and β = 2.5; linear, α = 2.6 × 10⁴; quadratic, α = 2.0 × 10⁵. D, Gaze duration as a function of gaze amplitude. Parameters are same as in C. E, Effect of context on gaze velocities. The data points are from the study by Epelboim et al. (1997). The gray data points correspond to the tap task in which volunteers looked at the target that they were reaching for, and the black data points correspond to the look task, in which they only looked at the target. Simulation results of the hyperbolic model are shown by the lines. The stimulus value α was increased from α = 1.25 × 10⁴ to α = 2.45 × 10⁴.

References

1. Alessi SM, Petry NM. Pathological gambling severity is associated with impulsivity in a delay discounting procedure. Behav Processes. 2003;64:345–354.
2. Bickel WK, Odum AL, Madden GJ. Impulsivity and cigarette smoking: delay discounting in current, never, and ex-smokers. Psychopharmacology (Berl). 1999;146:447–454.
3. Bizzi E. The coordination of eye-head movements. Sci Am. 1974;231:100–106.
4. Blekher T, Siemers E, Abel LA, Yee RD. Eye movements in Parkinson's disease: before and after pallidotomy. Invest Ophthalmol Vis Sci. 2000;41:2177–2183.
5. Chen-Harris H, Joiner WM, Ethier V, Zee DS, Shadmehr R. Adaptive control of saccades via internal feedback. J Neurosci. 2008;28:2804–2813.

Grants and funding

R01 NS037422/NS/NINDS NIH HHS/United States
