J Neurosci. 2025 Jan 1;45(1):e0080242024.
doi: 10.1523/JNEUROSCI.0080-24.2024.

Computational and Neural Evidence for Altered Fast and Slow Learning from Losses in Problem Gambling


Kiyohito Iigaya et al. J Neurosci.

Abstract

Learning occurs across multiple timescales, with fast learning crucial for adapting to sudden environmental changes, and slow learning beneficial for extracting robust knowledge from multiple events. Here, we asked if miscalibrated fast vs slow learning can lead to maladaptive decision-making in individuals with problem gambling. We recruited participants with problem gambling (PG; N = 20; 9 female and 11 male) and a recreational gambling control group without any symptoms associated with PG (N = 20; 10 female and 10 male) from the community in Los Angeles, CA. Participants performed a decision-making task involving reward-learning and loss-avoidance while being scanned with fMRI. Using computational model fitting, we found that individuals in the PG group showed evidence for an excessive dependence on slow timescales and a reduced reliance on fast timescales during learning. fMRI data implicated the putamen, an area associated with habit, and medial prefrontal cortex (PFC) in slow loss-value encoding, with significantly more robust encoding in medial PFC in the PG group compared to controls. The PG group also exhibited stronger loss prediction error encoding in the insular cortex. These findings suggest that individuals with PG have an impaired ability to adjust their predictions following losses, manifested by a stronger influence of slow value learning. This impairment could contribute to the behavioral inflexibility of problem gamblers, particularly the persistence in gambling behavior typically observed in those individuals after incurring loss outcomes.

Keywords: decision-making; fMRI; gambling; learning.


Figures

Figure 1.
Clinical scales for participants. DSM-IV (left), Alcohol Use Disorders Identification Test (AUDIT; middle), and Fagerstrom Test for Nicotine Dependence (FTND; right). The control group is shown in black and the PG group in purple.
Figure 2.
Task and behavior. A, The task. On each trial, a participant was presented with two stimuli, each of which was associated with a unique probability of outcomes. There were three trial types. On trials in the gain condition, the stimuli were associated with potential monetary gain outcomes. On trials in the loss-avoidance condition, the stimuli were associated with potential monetary losses. On neutral trials, participants received no monetary outcomes. B, Example outcome probabilities in the reward condition. The probabilities of reward associated with the two stimuli (choice A and choice B) are plotted as a function of the trial number. Choice B is initially more rewarding than choice A. After about 15 trials, the reward probabilities are reversed. C, Behavioral performance, measured by the total number of received rewards (losses) divided by the total number of trials in the gain (loss) condition, is shown before and after the reversal points for the control and PG groups.
Figure 3.
Detailed task behavior. A, Trial-by-trial choice dynamics in the gain condition. The mean (solid) choice probability for the control (black) and PG (purple) groups is shown across trials in the gain (left) and loss (right) conditions. The ideal choice probability is 0 before the reversal and 1 after the reversal. The shaded area indicates the SEM. B, The probability of repeating the same choice (good or bad) after receiving no reward (left) or reward (right). The probability is computed separately for the control (black; C) and PG (purple; G) groups pre- and post-reversal. The good choice is defined as the choice of the alternative associated with a higher gain probability or a smaller loss probability. C, Trial-by-trial choice dynamics in the loss condition. D, The probability of repeating the same choice after receiving a loss (left) or avoiding a loss (right). The group difference was significant for the probability of repeatedly selecting the better target after avoiding a loss following reversals (p < 0.01, permutation test).
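The repeat-choice measure in panels B and D conditions the current choice on the previous trial's outcome. A minimal sketch of that computation is below; the array names and outcome coding (e.g., 0 = loss avoided, -1 = loss incurred) are illustrative assumptions, not the authors' analysis code.

```python
import numpy as np

def repeat_probability(choices, outcomes, outcome_value):
    """P(choice[t] == choice[t-1]), restricted to trials whose previous
    outcome equaled `outcome_value`. Returns NaN if no trial qualifies.
    Coding is illustrative: e.g., 0 = loss avoided, -1 = loss incurred.
    """
    choices = np.asarray(choices)
    outcomes = np.asarray(outcomes)
    prev_mask = outcomes[:-1] == outcome_value   # condition on the previous trial's outcome
    repeats = choices[1:] == choices[:-1]        # did the choice repeat?
    if prev_mask.sum() == 0:
        return np.nan
    return repeats[prev_mask].mean()
```

In practice this would be computed separately pre- and post-reversal and for good versus bad choices, then compared across groups with a permutation test.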
Figure 4.
A computational model describing learning of reward history across multiple timescales captures human behavior on the task. A, Schematic of the computational model. Gain or loss history is integrated over two timescales independently to compute fast and slow values. The two values are then weighted with relative weights to compute an overall decision value for each condition separately. B, The model’s estimates of relative weights. The relative weights assigned to the slowly learned value with respect to the fast value are plotted for the control and PG groups for reward trials (top) and loss trials (bottom). There is no significant difference between groups in gain trials, but individuals with PG show a significantly larger weight than controls in loss trials (p < 0.001, permutation test). C, Model simulations show that the classic reinforcement learning model does not capture the behavioral data, but the two-timescale model does. From left to right: data, one-timescale model simulation, and two-timescale model simulation. The average monetary gain received per trial is shown before and after reversals for the control (black) and PG (purple) groups. The behavioral effects shown in Figure 2C are captured by the two-timescale learning model (right) but not the standard RL model (middle).
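The two-timescale scheme in panel A can be sketched as a pair of delta-rule learners sharing the same outcome stream: a fast value with a high learning rate, a slow value with a low one, mixed by a relative weight before a softmax choice. All parameter values and names below are illustrative assumptions, not the paper's fitted estimates.

```python
import numpy as np

def simulate_two_timescale(outcomes, alpha_fast=0.5, alpha_slow=0.05,
                           w_slow=0.3, beta=5.0, seed=0):
    """Simulate choices from a two-timescale delta-rule learner.

    Each option carries a fast and a slow value; both are updated by
    their own prediction error, but at different learning rates. The
    decision value is a weighted mix of the two, with `w_slow` playing
    the role of the relative weight on the slowly learned value.
    """
    rng = np.random.default_rng(seed)
    n_trials, n_options = outcomes.shape
    v_fast = np.zeros(n_options)
    v_slow = np.zeros(n_options)
    choices = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        v = w_slow * v_slow + (1.0 - w_slow) * v_fast   # combined decision value
        p = np.exp(beta * v) / np.exp(beta * v).sum()   # softmax choice rule
        c = rng.choice(n_options, p=p)
        choices[t] = c
        # the same outcome drives both value systems at different rates
        v_fast[c] += alpha_fast * (outcomes[t, c] - v_fast[c])
        v_slow[c] += alpha_slow * (outcomes[t, c] - v_slow[c])
    return choices, v_fast, v_slow
```

On this sketch, a larger `w_slow` makes the learner keep leaning on pre-reversal value estimates after a reversal, mirroring the PG group's excessive reliance on slow timescales in loss trials.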
Figure 5.
fMRI correlates of the slowly learned value component from the two-timescale computational model. A, A cluster of voxels in the left putamen is significantly correlated with the model’s predicted signal (whole-brain cFWE p < 0.05 with height threshold at p < 0.001). In addition, a cluster of voxels in the ACC is significantly correlated with the model’s predicted signal (whole-brain cFWE p < 0.05 with height threshold at p < 0.001). Both results come from analyses that pooled across the two groups. B, A cluster in the left insula was significantly correlated with prediction error signals from the slow-learning component of the two-timescale model (whole-brain cFWE p < 0.05 with height threshold at p < 0.001).
Figure 6.
Group difference in the fMRI correlates of slow-value learning in loss trials. A, The slow value signal. There was no significant difference between groups in an ROI defined on the putamen cluster identified from the pooled analysis. However, a cluster of voxels in the ACC is significantly correlated with the model’s predicted signal (whole-brain cFWE p < 0.05 with height threshold at p < 0.001); this result comes from an analysis that pooled across the two groups. B, The prediction error signal. The magnitude of correlation was significantly greater for the PG group than for controls (p < 0.05, permutation test).


