Processing speed enhances model-based over model-free reinforcement learning in the presence of high working memory functioning

Affiliations

¹ Department of Psychiatry and Psychotherapy, Charité Universitätsmedizin Berlin, Berlin, Germany.
² Department of Psychiatry and Psychotherapy, University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany.
³ Department of Psychiatry and Neuroimaging Center, Technische Universität Dresden, Dresden, Germany.
⁴ Institute of Behavioural Neuroscience, University College London, London, UK.
⁵ Department of Psychiatry and Psychotherapy, Charité Universitätsmedizin Berlin, Berlin, Germany; Area of Excellence Cognitive Sciences, University of Potsdam, Potsdam, Germany.
⁶ Translational Neuromodeling Unit, Institute of Biomedical Engineering, University of Zurich and Swiss Federal Institute of Technology, Zurich, Switzerland; Department of Psychiatry, Psychosomatics, and Psychotherapy, Hospital of Psychiatry, University of Zurich, Zurich, Switzerland.

PMID: 25566131
PMCID: PMC4269125
DOI: 10.3389/fpsyg.2014.01450

Daniel J Schad et al. Front Psychol. 2014 Dec 17:5:1450. doi: 10.3389/fpsyg.2014.01450. eCollection 2014.

Abstract

Theories of decision-making and its neural substrates have long assumed the existence of two distinct and competing valuation systems, variously described as goal-directed vs. habitual, or, more recently and based on statistical arguments, as model-free vs. model-based reinforcement-learning. Though both have been shown to control choices, the cognitive abilities associated with these systems are under ongoing investigation. Here we examine the link to cognitive abilities, and find that individual differences in processing speed covary with a shift from model-free to model-based choice control in the presence of above-average working memory function. This suggests shared cognitive and neural processes; provides a bridge between literatures on intelligence and valuation; and may guide the development of process models of different valuation components. Furthermore, it provides a rationale for individual differences in the tendency to deploy valuation systems, which may be important for understanding the manifold neuropsychiatric diseases associated with malfunctions of valuation.

Keywords: cognitive abilities; decision-making; fluid intelligence; habitual and goal-directed system; model-based and model-free learning; reward.

Figures

**Figure 1**
**(A)** Trial structure: Step 1 consisted of a choice between two abstract gray stimuli. The unchosen stimulus faded away while the chosen stimulus was highlighted with a red frame and moved to the top of the screen, where it remained visible for 1.5 s. In Step 2, a second, colored stimulus pair appeared. Step 2 choices resulted in either a win of 20 cents or no win. **(B)** Transition structure: Each first-stage stimulus led to one fixed second-stage pair in 70% of the trials (*common transition*) and to the other second-stage stimulus pair in 30% of the trials (*rare transition*). Reinforcement probabilities for each second-stage stimulus changed slowly and independently between 25% and 75% according to Gaussian random walks with reflecting boundaries (Daw et al., 2011). Win probabilities, P(reward), are displayed as a function of trial number. **(C)** Model predictions: Predictions from the computational model (Daw et al., 2011) based on the model-free (left panel) vs. model-based (right panel) system for the probability of repeating the choice from the previous trial as a function of reward (rew., rewarded; unrew., unrewarded) and transition type on the previous trial. Model-free choice predicts a main effect of *reward* and no effect of *transition*. Model-based choice predicts a *transition × reward* interaction. Figure partly adapted from Sebold et al. (2014).
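
The task environment described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' task code: the function names, the random-walk standard deviation (`sd=0.025`), the uniform starting values, and the convention that first-stage choice 0 commonly leads to pair 0 are all assumptions made here. Only the 70%/30% transition split, the 25%–75% bounds, and the reflecting Gaussian random walk come from the caption.

```python
import random


def reflect(p, lo=0.25, hi=0.75):
    """Reflect a value back into [lo, hi] (reflecting boundaries)."""
    while p < lo or p > hi:
        if p < lo:
            p = 2 * lo - p
        if p > hi:
            p = 2 * hi - p
    return p


def drift_reward_probs(n_trials, n_stimuli=4, sd=0.025, seed=0):
    """Return one list of reward probabilities per trial.

    Each second-stage stimulus follows its own independent Gaussian
    random walk. The step size `sd` and the uniform starting values
    are assumptions for illustration only.
    """
    rng = random.Random(seed)
    probs = [rng.uniform(0.25, 0.75) for _ in range(n_stimuli)]
    history = []
    for _ in range(n_trials):
        history.append(list(probs))
        probs = [reflect(p + rng.gauss(0.0, sd)) for p in probs]
    return history


def second_stage_pair(first_stage_choice, rng, p_common=0.7):
    """Sample the second-stage pair (0 or 1) reached from a first-stage choice.

    Assumed convention: choice 0 commonly leads to pair 0 and choice 1
    commonly leads to pair 1. Returns (pair, was_common_transition).
    """
    common = rng.random() < p_common
    pair = first_stage_choice if common else 1 - first_stage_choice
    return pair, common
```

Calling `drift_reward_probs(200)` yields 200 rows of four probabilities that stay within the 25%–75% band, and repeated calls to `second_stage_pair` produce a common transition on roughly 70% of trials.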

**Figure 2**
**(A–C)** Choice repetition probabilities: Average proportion of trials on which participants repeated their previous choice, as a function of outcome (reward vs. no reward) and transition (common vs. rare) at the previous trial. Results are presented for individuals with a *low* (A, 35–59), *medium* (B, 59–75), and *high* (C, 76–98) performance score on the *Digit Symbol Substitution Test* (*DSST*). Error bars are subject-based standard errors of the means. **(D–E)** Individual reward and transition effects and DSST performance: Individual estimates of the main effect of *reward* (= rewarded − unrewarded; D) and the *reward* × transition interaction (= rewarded common − rewarded rare − unrewarded common + unrewarded rare; E) on repetition-probabilities (*p_repeat*: repetition = 1, switch = 0) as a function of individual *DSST* scores. Lines show the estimated quadratic **(D)** and linear **(E)** effects with 95% confidence intervals.
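
The two contrasts defined in panels D and E can be computed from per-cell repetition proportions, as in the following sketch. The function name and the input format (one tuple of booleans per trial) are hypothetical. Averaging over transition types for the reward main effect is also an assumption, since the caption only states "rewarded − unrewarded"; the interaction follows the caption's formula directly.

```python
def stay_effects(trials):
    """Compute repetition proportions and the two contrasts.

    `trials` is an iterable of (rewarded, common, repeated) booleans,
    where `rewarded` and `common` describe the previous trial and
    `repeated` says whether the first-stage choice was repeated.
    All four reward x transition cells must contain at least one trial.
    """
    counts = {}
    for rewarded, common, repeated in trials:
        n, k = counts.get((rewarded, common), (0, 0))
        counts[(rewarded, common)] = (n + 1, k + int(repeated))

    # Proportion of repeated choices in each cell.
    p = {cell: k / n for cell, (n, k) in counts.items()}

    # Main effect of reward: rewarded minus unrewarded, each taken as the
    # unweighted mean over common and rare transitions (an assumption).
    reward_effect = (
        (p[(True, True)] + p[(True, False)]) / 2
        - (p[(False, True)] + p[(False, False)]) / 2
    )

    # Reward x transition interaction, as written in the caption:
    # rewarded common - rewarded rare - unrewarded common + unrewarded rare.
    interaction = (
        p[(True, True)] - p[(True, False)]
        - p[(False, True)] + p[(False, False)]
    )
    return p, reward_effect, interaction
```

Under the model predictions summarized in Figure 1C, a purely model-free chooser would show a positive `reward_effect` with an `interaction` near zero, whereas a purely model-based chooser would show a positive `interaction`.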

**Figure 3**
Individual parameter estimates and DSST performance: Maximum posterior parameter values of the dual-system reinforcement learning model for each participant as a function of performance on the *Digit Symbol Substitution Test* (*DSST*) are displayed. The lines represent predictions from linear regressions of each model parameter on DSST scores, with 95% confidence intervals (CI). **(A–D)** Regression lines and CI in unbounded fitting-space were transformed to model-space for plotting by passing them through the inverse-logit function. **(A)** Best-fitting individual parameter values for the weighting parameter ω, which determines the balance between model-free (weight = 0) and model-based (weight = 1) control. **(B)** Regression of best-fitting weighting parameter values on the interaction between DSST scores × working memory span (median-split factor). **(C)** Best-fitting parameter values for the second-stage learning rate α₂. **(D)** The lambda (λ) parameter determines update of model-free step 1 action values by step 2 prediction errors. **(E)** Repetition factor, p, indicates how strongly individuals tend to repeat previous actions.
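
The inverse-logit transformation mentioned for panels A–D, and the role of the weighting parameter ω, can be illustrated as follows. This is a sketch under stated assumptions: the linear mixture in `hybrid_value` follows the common formulation of the dual-system model (Daw et al., 2011) and is not spelled out in the caption, and all function names and the regression coefficients passed to `line_in_model_space` are hypothetical placeholders, not estimates from the study.

```python
import math


def inv_logit(x):
    """Map an unbounded value to the (0, 1) interval, numerically stable."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit(p):
    """Map a probability-like value in (0, 1) to the unbounded scale."""
    return math.log(p / (1.0 - p))


def line_in_model_space(intercept, slope, x_values):
    """Evaluate a regression line fitted in unbounded space, then
    transform each prediction to model space with the inverse logit."""
    return [inv_logit(intercept + slope * x) for x in x_values]


def hybrid_value(q_mb, q_mf, omega):
    """Weighted first-stage action value (assumed linear mixture).

    omega = 0 gives purely model-free values and omega = 1 gives
    purely model-based values, matching the caption's description.
    """
    return omega * q_mb + (1.0 - omega) * q_mf
```

Because the inverse logit is monotonic, a regression line and its confidence band fitted in unbounded space remain ordered after transformation, which is why the plotted curves stay inside the parameter's valid range of 0 to 1.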

Grants and funding

Wellcome Trust/United Kingdom
