Clin Psychol Sci. 2024 Nov;12(6):1146-1161.
doi: 10.1177/21677026231213368. Epub 2024 Jan 24.

Reinforcement-Learning-Informed Queries Guide Behavioral Change

Vanessa M Brown et al. Clin Psychol Sci. 2024 Nov.

Abstract

Algorithmically defined aspects of reinforcement learning correlate with psychopathology symptoms and change with symptom improvement following cognitive-behavioral therapy (CBT). Separate work in nonclinical samples has shown that varying the structure and statistics of task environments can change learning. Here, we combine these literatures, drawing on CBT-based guided restructuring of thought processes and computationally defined mechanistic targets identified by reinforcement-learning models in depression, to test whether and how verbal queries affect learning processes. Using a parallel-arm design, we tested 1,299 online participants completing a probabilistic reward-learning task while receiving repeated queries about the task environment (11 learning-query arms and one active control arm). Querying participants about reinforcement-learning-related task components altered computational-model-defined learning parameters in directions specific to the target of the query. These effects on learning parameters were consistent across depression-symptom severity, suggesting new learning-based strategies and therapeutic targets for evoking symptom change in mood psychopathology.

Keywords: depression; experimental therapeutics; psychopathology; reinforcement learning.

Conflict of interest statement

Declaration of Conflicting Interests: V. M. Brown has received consulting fees from Aya Technologies. The other authors report no conflicts of interest to disclose.

Figures

Fig. 1.
Task design and ratings. (a) A single trial with query in the guided learning task. Participants were presented two stimuli, made a choice, received the outcome for that choice, and on every third trial were then queried about a specific aspect of a stimulus—here, the least ever received for the last chosen option. (b) Participants’ ratings of engagement (first column), interest (second column), and difficulty (third column); ratings were obtained from a Likert-type scale where 0 = not at all and 10 = very much. The first row displays histograms of ratings for all participants, the second row displays relationships between ratings and changes in positive affect during the task, and the third row displays relationships between ratings and changes in negative affect. Each plot in the second and third rows displays a dot per participant, a line of best fit, and Pearson correlation coefficient.
Fig. 2.
Learning curves by query arm. Query arms are indicated by color and/or number. (a) Each panel shows learning curves and query accuracy over time for the indicated query arms. The active control arm is plotted in gray on each panel. The top half of each panel shows a running average (last five trials) of choice accuracy (y-axis), defined as the percentage of trials on which the stimulus more likely to lead to a higher monetary outcome is chosen. The bottom half of each panel shows a running average (last three queries) of query accuracy (y-axis), defined as the signed difference between the actual and correct responses to queries. Here, 0 on the y-axis indicates perfect accuracy, and distance above or below 0 indicates greater query inaccuracy. The x-axis shows the trial number. (b) Overall choice accuracy by query arm. Bars indicate SEM. (c) Overall proportion of switches between options by query arm. Bars indicate SEM. For display purposes, query arm data are grouped by query target or stimulus chosen.
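The two plotted quantities in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names are hypothetical, the sign convention (response minus correct value) is an assumption, and so is the choice to average over fewer points at the start of the series, since the caption does not say how early trials are handled.

```python
from typing import List, Sequence


def running_mean(values: Sequence[float], window: int) -> List[float]:
    """Trailing mean over the last `window` values.

    Assumption: early positions average over however many values exist so far.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def signed_query_error(responses: Sequence[float],
                       correct: Sequence[float]) -> List[float]:
    """Signed difference per query; 0 means a perfectly accurate response.

    Assumption: the sign is response minus correct value.
    """
    return [r - c for r, c in zip(responses, correct)]


# Hypothetical data: 1 = chose the better stimulus, 0 = chose the worse one.
choices_correct = [1, 0, 1, 1, 0, 1]
choice_curve = running_mean(choices_correct, window=5)

# Hypothetical query responses and their correct values.
query_curve = running_mean(signed_query_error([5, 3, 4], [4, 4, 4]), window=3)
```

Values of `query_curve` near 0 correspond to accurate query responses, matching the caption's reading of the y-axis.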
Fig. 3.
Changes in learning parameters by query arm relative to active control. Shaded areas represent the posterior distribution of the difference from active control for each query arm. Opaque colors represent distributions significantly different from 0 (95% of posterior above or below 0).
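The criterion in this caption (95% of the posterior above or below 0) can be sketched as a check on posterior samples of the difference from active control. This is an illustration under assumptions, not the authors' procedure: the function name and the sample values are hypothetical, and the sketch assumes the posterior is available as a list of draws.

```python
from typing import Sequence


def posterior_excludes_zero(diff_samples: Sequence[float],
                            threshold: float = 0.95) -> bool:
    """True if at least `threshold` of the draws lie on one side of 0."""
    n = len(diff_samples)
    frac_above = sum(1 for d in diff_samples if d > 0) / n
    frac_below = sum(1 for d in diff_samples if d < 0) / n
    return frac_above >= threshold or frac_below >= threshold


# Hypothetical draws: 96% of the mass above 0, so the arm would be flagged.
mostly_positive = [0.1] * 96 + [-0.1] * 4
# Hypothetical draws split evenly around 0, so the arm would not be flagged.
centered = [0.1] * 50 + [-0.1] * 50

flag_positive = posterior_excludes_zero(mostly_positive)
flag_centered = posterior_excludes_zero(centered)
```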
Fig. 4.
Relationships among choice accuracy, query accuracy, and symptom severity. On each plot, colored lines and dots indicate query arm (see Table 1). Each dot represents one participant, and colored lines represent the line of best fit per arm. Thick black lines indicate the line of best fit for all participants across all arms. “Depression,” “anxiety,” and “stress severity” refer to Depression Anxiety and Stress Scale depression, anxiety, and stress subscale scores, respectively. (a) Relationship between query and choice accuracy. Across all conditions, more accurate responses on queries (lower distance from correct value) were related to better choice accuracy. (b) Relationship between symptom severity and choice accuracy. Overall, higher symptom severity was related to worse choice accuracy. (c) Relationship between symptom severity and query accuracy. Overall, higher symptom severity was related to worse query accuracy.
