Neurobiol Aging. 2018 Aug:68:102-113.

doi: 10.1016/j.neurobiolaging.2018.04.006. Epub 2018 Apr 19.

Age affects reinforcement learning through dopamine-based learning imbalance and high decision noise-not through Parkinsonian mechanisms

Ravi B Sojitra¹, Itamar Lerner², Jessica R Petok³, Mark A Gluck⁴

Affiliations

¹ Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, Newark, NJ, USA; Department of Mathematics and Computer Science, Rutgers University, Newark, Newark, NJ, USA. Electronic address: ravisoji@gmail.com.
² Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, Newark, NJ, USA. Electronic address: itamar.lerner@gmail.com.
³ Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, Newark, NJ, USA; Department of Psychology, St. Olaf College, Northfield, MN, USA.
⁴ Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, Newark, NJ, USA. Electronic address: gluck@newark.rutgers.edu.

PMID: 29778803
PMCID: PMC5993631
DOI: 10.1016/j.neurobiolaging.2018.04.006

Ravi B Sojitra et al. Neurobiol Aging. 2018 Aug.

Abstract

Probabilistic reinforcement learning declines in healthy cognitive aging. While some findings suggest impairments are especially conspicuous in learning from rewards, resembling deficits in Parkinson's disease, others also show impairments in learning from punishments. To reconcile these findings, we tested 252 adults from 3 age groups on a probabilistic reinforcement learning task, analyzed trial-by-trial performance with a Q-reinforcement learning model, and correlated both fitted model parameters and behavior to polymorphisms in dopamine-related genes. Analyses revealed that learning from both positive and negative feedback declines with age but through different mechanisms: when learning from negative feedback, older adults were slower due to noisy decision-making; when learning from positive feedback, they tended to settle for a nonoptimal solution due to an imbalance in learning from positive and negative prediction errors. The imbalance was associated with polymorphisms in the DARPP-32 gene and appeared to arise from mechanisms different from those previously attributed to Parkinson's disease. Moreover, this imbalance predicted previous findings on aging using the Probabilistic Selection Task, which were misattributed to Parkinsonian mechanisms.

Keywords: Aging; Dopamine; Parkinson's disease; Q-learning; Reinforcement learning.
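The abstract describes a Q-learning model with separate sensitivity to positive and negative prediction errors plus decision noise. The following is a minimal sketch of that general model family, not the authors' implementation. The names `alpha_pos`, `alpha_neg`, `beta`, and `r0` are illustrative; treating `beta` as a softmax inverse temperature, treating `r0` as the subjective value of receiving no feedback, and the 0.9/0.1 feedback contingencies in `simulate` are all assumptions made for this example.

```python
import math
import random


def choice_prob(q_a, q_b, beta):
    """Softmax probability of picking response A over response B.

    beta is an assumed inverse-temperature parameter: a smaller beta
    gives noisier (more random) decisions.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def update_q(q, outcome, alpha_pos, alpha_neg):
    """One Q-value update; the learning rate depends on the prediction-error sign."""
    delta = outcome - q  # prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta


def simulate(n_trials, alpha_pos, alpha_neg, beta, r0, seed=0):
    """Simulate choices for one positive-feedback stimulus.

    Assumed contingencies (hypothetical): response A earns positive
    feedback (+1) with probability 0.9, response B with probability 0.1;
    otherwise no feedback is shown and the outcome is valued at r0.
    """
    rng = random.Random(seed)
    q = {"A": 0.0, "B": 0.0}
    choices = []
    for _ in range(n_trials):
        p_a = choice_prob(q["A"], q["B"], beta)
        choice = "A" if rng.random() < p_a else "B"
        p_reward = 0.9 if choice == "A" else 0.1
        outcome = 1.0 if rng.random() < p_reward else r0
        q[choice] = update_q(q[choice], outcome, alpha_pos, alpha_neg)
        choices.append(choice)
    return choices, q
```

In a sketch like this, lowering `beta` makes choices more random regardless of what was learned, whereas unequal `alpha_pos` and `alpha_neg` change which values the learner converges to. This is one way to picture the two distinct mechanisms the abstract contrasts.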

Conflict of interest statement

Disclosure Statement

The authors confirm that there are no known conflicts of interest associated with the publication of this manuscript.

Figures

**Fig. 1**
Stimuli and Feedback Conditions.

**Fig. 2**
Behavioral Task and Results for the 252 participants. (a) Experimental task. Participants learned to classify 4 stimuli to one of two arbitrary categories (Rain or Sun) by trial and error using probabilistic feedback. Two stimuli yielded positive feedback (smiling face and positive points) on 90% of the trials and no feedback on 10% of the trials. The other two stimuli yielded negative feedback (frowning face and negative points) on 90% of the trials and no feedback on 10% of the trials. Two example trials are presented. (b) Learning curves for the positive and negative feedback conditions. Error bars illustrate standard errors of the means (see Supplementary Information for additional analyses). (c) Distributions of scores on Block 4 for the different age groups. The y-axis represents the percent of individual response-sequences within an age group and the x-axis marks each decile of performance score. The scores for stimulus A and stimulus B of each feedback condition were computed and counted separately to avoid the score of one interfering with the other (for example, when one stimulus receives a perfect score and the other zero, they average to a misleading “random chance” score of 0.5). (d) Left, deviance from chance performance, indicating the *degree* of learning a solution irrespective of the *type* of solution. Right, percent of optimal solutions given convergence to any solution. Convergence to a solution was defined as at least three consecutive blocks with accuracy higher than 0.9, or lower than 0.1, for the optimal and non-optimal solutions, respectively. *** p<0.0001.

**Fig. 3**
Analysis of fitted parameters (for 250 participants). (a) Left, average values of the four parameters fit to the model, by age group. Right, average Learning Rate Imbalance measure, by age group. * - p < 0.02; ** - p < 0.008; *** - p < 0.0001; † - trend; n.s. - not significant. Error bars illustrate standard errors of the means. (b) Correlations of behavioral performance measures with Decision Noise (top row) and Learning Rate Imbalance (bottom row) across all participants. A small amount of Gaussian noise (SD = 0.01) was added to the scatter plots’ datapoints to improve visualisation. *** - p < 0.0001; n.s. - not significant. (c) Left, 3D scatter of three individually fit parameters: α⁺, α⁻ and R₀, for all participants in the study. Each dot represents one participant. The projection of each dot on the X–Y plane is marked by a small grey dot to allow easier understanding of the 3D scatter. Right, 2D projections of the 3D scatter plot, on two different planes, separately for each age group. Red: participants that learned a non-optimal solution for at least one of the positive-feedback stimuli (‘Non-optimal performers’). Blue: rest of participants. A small amount of Gaussian noise (SD = 0.01) was added to the datapoints to improve visualisation.

**Fig. 4**
Simulation of the Probabilistic Selection Task using our learned parameters and model (for 250 participants). (a) Simulation results of the Probabilistic Selection Task (upper row) compared to human results (lower row, reprinted from [20] with permission). Error bars illustrate standard errors of the means. Only the younger and older simulated groups are displayed for easier comparison (see Fig. S2 for full plots). Left, average difference during training on block 1 between the probability of re-selecting the response that was rewarded on the preceding trial and the probability of shifting the response away from the one punished on the preceding trial. Younger adults showed a larger difference than older adults (Age × Preference: [F(2,247)=13.199, p<0.0001]; pairwise comparisons for younger vs. older: p<0.0001). Middle, Learning Bias changes. Average performance at test on novel pairings of stimuli that were previously mostly rewarded (‘Choose A’) compared to novel pairings of stimuli that were previously mostly punished (‘Avoid B’). Younger adults had a larger difference between the two than older adults (Age × Preference: [F(2,247)=14.257, p<0.0001]; pairwise comparisons for younger vs. older: p<0.04). Right, learning biases (defined as the difference between ‘Choose A’ and ‘Avoid B’) for all participants, ordered by bias values. Whereas younger adults had many more individuals with a positive learning bias than with a negative learning bias, the numbers were more evenly distributed in the older group. (b) Left, learning bias as a function of the Learning Rate Imbalance, showing a strong negative correlation (r(248)=-0.61, p<0.0001). Right, learning bias as a function of the learning rate disparity, showing an inverted U-shape. Low learning bias is achieved either with very low disparity values (in line with the ‘harm avoidant’ learning pattern previously hypothesized to characterize PD patients) or with very high disparity values (in line with the ‘reward-seeking’ learning pattern, to which most older adults in our study actually belonged; see Fig. 3c).

**Fig. 5**
Effects of DARPP-32 polymorphisms on reward learning and Learning Rate Imbalance (LRI) for 212 participants whose genetic data was available. Error bars illustrate standard errors of the means.
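The convergence criterion stated in the Fig. 2 caption (at least three consecutive blocks with accuracy above 0.9 for an optimal solution, or below 0.1 for a non-optimal one) can be sketched as a small classifier over per-block accuracies. This is an illustrative reading of the caption, not the authors' analysis code. The function name `classify_solution` and the choice to return the first qualifying run are assumptions.

```python
def classify_solution(block_accuracies, run_length=3, hi=0.9, lo=0.1):
    """Return 'optimal', 'non-optimal', or None (no convergence).

    A solution counts as converged once `run_length` consecutive blocks
    all have accuracy above `hi` (optimal) or all below `lo` (non-optimal).
    Returning the first such run is an assumption of this sketch.
    """
    run_kind, run = None, 0
    for acc in block_accuracies:
        if acc > hi:
            kind = "optimal"
        elif acc < lo:
            kind = "non-optimal"
        else:
            kind = None
        if kind is not None and kind == run_kind:
            run += 1  # extend the current run
        else:
            run_kind, run = kind, (1 if kind is not None else 0)
        if run_kind is not None and run >= run_length:
            return run_kind
    return None
```

For example, `classify_solution([0.5, 0.95, 1.0, 0.92])` classifies a response sequence whose last three blocks all exceed 0.9, while a sequence that dips back toward chance between high-accuracy blocks is treated as not converged.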

Comment in

How age affects reinforcement learning.
Lerner I, Sojitra R, Gluck M. Aging (Albany NY). 2018 Nov 12;10(12):3630-3631. doi: 10.18632/aging.101649. PMID: 30418934.

References

1. Backman L, Nyberg L, Lindenberger U, Li S-C, Farde L. The correlative triad among aging, dopamine, and cognition. Neurosci. Biobehav. Rev. 2006;30:791–807.
2. Bodi N, Keri S, Nagy H, Moustafa A, Myers CE, Daw N, Dibo G, Takats A, Bereczki D, Gluck MA. Reward-learning and the novelty-seeking personality: a between- and within-subjects study of the effects of dopamine agonists on young Parkinson’s patients. Brain. 2009;132:2385–2395.
3. Bohnen NI, Muller ML, Kuwabara H, Cham R, Constantine GM, Studenski SA. Age-associated striatal dopaminergic denervation and falls in community-dwelling subjects. J. Rehab. Res. Dev. 2009;46:1045–1052.
4. Calabresi P, Gubellini P, Centonze D, Picconi B, Bernardi G, Chergui K, Svenningsson P, Fienberg AA, Greengard P. Dopamine and cAMP-regulated phosphoprotein 32 kDa controls both striatal long-term depression and long-term potentiation, opposing forms of synaptic plasticity. J. Neurosci. 2000;20:8443–8451.
5. Cavanagh J, Masters SE, Bath K, Frank MJ. Conflict acts as an implicit cost in reinforcement learning. Nature Communications. 2014;5:5394.

Grants and funding

R03 AG044610/AG/NIA NIH HHS/United States
