Front Comput Neurosci. 2014 Apr 16:8:47.
doi: 10.3389/fncom.2014.00047. eCollection 2014.

An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning

Pragathi P Balasubramani et al. Front Comput Neurosci.

Abstract

Although empirical and neural studies show that serotonin (5HT) plays many functional roles in the brain, prior computational models mostly focus on its role in behavioral inhibition. In this study, we present a model of risk-based decision making in a modified Reinforcement Learning (RL) framework. The model depicts the roles of dopamine (DA) and serotonin (5HT) in the Basal Ganglia (BG). In this model, the DA signal is represented by the temporal difference error (δ), while the 5HT signal is represented by a parameter (α) that controls the risk prediction error. This formulation, which accommodates both 5HT and DA, reconciles some of the diverse roles of 5HT, particularly in connection with the BG system. We apply the model to different experimental paradigms used to study the role of 5HT: (1) risk-sensitive decision making, where 5HT controls risk assessment; (2) temporal reward prediction, where 5HT controls the time scale of reward prediction; and (3) reward/punishment sensitivity, in which the punishment prediction error depends on 5HT levels. Thus the proposed integrated RL model reconciles several existing theories of 5HT and DA in the BG.
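The roles the abstract assigns to δ and α can be illustrated with a small single-step learner. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the equations, so the utility form `Q - alpha * sign(Q) * sqrt(h)`, the risk prediction error `delta**2 - h`, the learning rates, and all names (`RiskSensitiveLearner`, `lr_q`, `lr_h`) are hypothetical choices made for illustration.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


class RiskSensitiveLearner:
    """Single-step value learner that tracks mean reward and reward variance.

    Assumption (not stated in the abstract): utility is the value estimate
    penalized by alpha times the estimated standard deviation of the reward.
    """

    def __init__(self, n_actions, alpha=0.5, lr_q=0.1, lr_h=0.1):
        self.alpha = alpha            # stands in for the 5HT-like parameter
        self.lr_q = lr_q              # learning rate for the value estimate
        self.lr_h = lr_h              # learning rate for the risk estimate
        self.q = [0.0] * n_actions    # expected reward per action
        self.h = [0.0] * n_actions    # estimated reward variance per action

    def update(self, action, reward):
        # Reward prediction error for a one-step task (the DA-like signal).
        delta = reward - self.q[action]
        # Assumed risk prediction error: squared error minus current risk.
        xi = delta ** 2 - self.h[action]
        self.q[action] += self.lr_q * delta
        self.h[action] += self.lr_h * xi
        return delta, xi

    def utility(self, action):
        # Assumed utility: value minus alpha-weighted risk, signed by value.
        q, h = self.q[action], self.h[action]
        return q - self.alpha * sign(q) * math.sqrt(h)


def train_two_options(alpha, n_pairs=500):
    """Train on a safe option (always 1) and a risky option (0 or 2)."""
    agent = RiskSensitiveLearner(n_actions=2, alpha=alpha)
    for _ in range(n_pairs):
        agent.update(0, 1.0)   # safe option, fixed reward
        agent.update(0, 1.0)
        agent.update(1, 0.0)   # risky option, alternating rewards, mean 1
        agent.update(1, 2.0)
    return agent
```

Both options have the same mean reward, so with `alpha = 0` their utilities equal their value estimates. In this sketch a positive `alpha` lowers the utility of the variable option relative to the fixed one, and a negative `alpha` raises it, which is one way a single parameter could shift risk sensitivity.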

Keywords: Decision Making; Punishment; Reinforcement Learning; Reward; Risk; basal ganglia; dopamine; serotonin.

Figures

Figure 1
Selection of the blue flowers in our simulation (Sims), averaged over 1000 instances, compared with the experiment (Expt) adapted from Real (1981). The red line indicates the contingency reversal.
Figure 2
Comparison between the experimental and simulated results for (A) overall choice, (B) unequal EV, and (C) equal EV, under the RTD and baseline (control) conditions. Error bars represent the SE with size N = 100. For every condition, the comparison between the experiment (Expt) and the simulation (Sims) did not reject the null hypothesis of no difference between means (P > 0.05). The experimental results are adapted from Long et al. (2009).
Figure 3
(A) Selection of the long-term reward as a function of α. Increasing γ increased the frequency of selecting the larger and more delayed reward. Increasing α gave similar results for a fixed γ. (B) Differences in the utilities (U) between the yellow and white panels, averaged across trials for the states s_t, as a function of γ and α. Here N = 2000.
Figure 4
(A) The mean number of errors in non-switch trials in simulation (Sims) as a function of α and outcome trial type, for α = 0.5 (balanced) and α = 0.3 (tryptophan depletion). Error bars represent standard errors of the difference as a function of α for size N = 100. (B) Experimental (Expt) error percentages adapted from Cools et al. (2008). Error bars represent standard errors as a function of drink. The results in (B) were reported after excluding the trials from the acquisition stage of each block.
Figure 5
The mean number of errors in non-switch trials as a function of condition. Simulation (Sims): α = 0.5 (balanced) and α = 0.3 (tryptophan depletion). Experimental (Expt) results are adapted from Cools et al. (2008). Error bars represent standard errors, either as a function of drink in the experiment or of α in the simulation, for size N = 100.
