Selective Effects of the Loss of NMDA or mGluR5 Receptors in the Reward System on Adaptive Decision-Making

Przemysław Eligiusz Cieślak et al.

eNeuro. 2018 Oct 5;5(4):ENEURO.0331-18.2018.
doi: 10.1523/ENEURO.0331-18.2018. eCollection 2018 Jul-Aug.
Abstract

Selecting the most advantageous actions in a changing environment is a central feature of adaptive behavior. Midbrain dopamine (DA) neurons, along with the major targets of their projections, including dopaminoceptive neurons in the frontal cortex and basal ganglia, play a key role in this process. Here, we investigate the consequences of selective genetic disruption of NMDA receptors and metabotropic glutamate receptor 5 (mGluR5) in the DA system on adaptive choice behavior in mice. We tested the effects of the mutations on performance in probabilistic reinforcement learning and probability-discounting tasks. In the probabilistic choice task, both the loss of NMDA receptors in dopaminergic neurons and the loss of mGluR5 receptors in D1 receptor-expressing dopaminoceptive neurons reduced the probability of selecting the more rewarded alternative and lowered the likelihood of returning to the previously rewarded alternative (win-stay). When the observed behavior was fitted to reinforcement learning models, we found that these two mutations were associated with a reduced effect of the expected outcome on choice (i.e., more random choices). None of the mutations affected probability discounting, which indicates that all animals retained a normal ability to assess probability. However, in both behavioral tasks, animals with targeted loss of NMDA receptors in dopaminergic neurons or of mGluR5 receptors in D1 neurons were significantly slower to perform choices. In conclusion, these results show that glutamate receptor-dependent signaling in the DA system is essential for the speed and accuracy of choices but is probably not critical for the correct estimation of probable outcomes.

Keywords: decision-making; dopamine; glutamate receptors; mouse behavior; reinforcement learning.
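To make the modeling approach concrete, the following is a minimal sketch of a delta-rule (Q-learning) agent with a softmax choice rule in a two-port 80%/20% task, assuming a standard parameterization; the parameter names (alpha, beta) and values are illustrative only and do not reproduce the paper's fitted "model 3". Lowering the softmax weight beta corresponds to a reduced effect of the expected outcome on choice, i.e., more random choices.

import numpy as np

rng = np.random.default_rng(0)

def simulate(alpha, beta, p_reward=(0.8, 0.2), n_trials=60):
    """Simulate one block of a two-port probabilistic choice task."""
    q = np.zeros(2)                    # expected value of each port
    choices, rewards = [], []
    for _ in range(n_trials):
        # softmax: beta scales how strongly expected outcome drives choice
        p_port1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        c = int(rng.random() < p_port1)
        r = int(rng.random() < p_reward[c])
        q[c] += alpha * (r - q[c])     # delta-rule value update
        choices.append(c)
        rewards.append(r)
    return np.array(choices), np.array(rewards)

# A lower beta yields more random choices, the pattern the abstract
# describes for the mutant mice (illustrative values only).
for beta in (5.0, 1.0):
    c, r = simulate(alpha=0.3, beta=beta)
    print(f"beta={beta}: 80% port chosen on {np.mean(c == 0):.2f} of trials")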


Figures

Graphical abstract

Figure 1.
The probabilistic reinforcement learning task. A, Schematic representation of the probabilistic reinforcement learning task. The animal could make a nose-poke in one of two ports. Following a nose-poke, water could be delivered with a probability that depended on the chosen port. The nose-poke ports were randomly assigned 80% or 20% reward probabilities. During each session, the reward probabilities were reversed after 60 trials. B, An example of the choice behavior of a mouse over 600 trials (sessions 6–10). The black line shows the probability of choosing the left side (data smoothed with a 21-point moving average). The cyan bars indicate the side with the higher probability of reward delivery. The red dashed line indicates session boundaries. C–H, Probability of selecting the alternative with the higher reward probability by the NR1DATCreERT2 (mutant, n = 6; control, n = 8; C, F), mGluR5KD-D1 (mutant, n = 8; control, n = 9; D, G), and NR1D1CreERT2 (mutant, n = 6; control, n = 9; E, H) strains. C–E, Session-by-session analysis; data were collapsed across trials. F–H, Trial-by-trial analysis; data were collapsed across sessions. Data are represented as the mean ± SEM.
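As a side note on the smoothing mentioned in panel B, a 21-point moving average of a binary left/right choice sequence can be computed as below; the choice data here are randomly generated placeholders, not the paper's data.

import numpy as np

# Hypothetical binary choice sequence (1 = left port), 600 trials.
choices_left = np.random.default_rng(1).integers(0, 2, size=600)
window = 21
# Probability of choosing left, smoothed with a 21-point moving average.
p_left_smoothed = np.convolve(choices_left, np.ones(window) / window, mode="same")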
Figure 2.
Computational modeling results. A–C, Density plots of posterior group parameter distributions with the best model (model 3) for the NR1DATCreERT2 (A), mGluR5KD-D1 (B), and NR1D1CreERT2 (C) strains. Credible differences are marked with stars, and vertical bars below the plots show 95% HDI ranges.
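For reference, a 95% highest-density interval (HDI) like those marked below the density plots can be computed from posterior samples as sketched here; this generic routine is an assumption and does not reproduce the paper's hierarchical Bayesian fitting procedure.

import numpy as np

def hdi(samples, cred_mass=0.95):
    """Return the narrowest interval containing cred_mass of the samples."""
    s = np.sort(np.asarray(samples))
    n_keep = int(np.ceil(cred_mass * len(s)))
    widths = s[n_keep - 1:] - s[:len(s) - n_keep + 1]
    lo = int(np.argmin(widths))
    return s[lo], s[lo + n_keep - 1]

# Toy posterior samples (beta distribution), for illustration only.
posterior = np.random.default_rng(2).beta(8, 3, size=10_000)
print(hdi(posterior))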
Figure 3.
Effects of previous outcomes on choice. A–C, Probabilities of repeating the same choice when the previous response was rewarded (win-stay) or switching to an alternative choice when the preceding response yielded no reward (lose-shift) in the NR1DATCreERT2 (mutant, n = 6; control, n = 8; A), mGluR5KD-D1 (mutant, n = 8; control, n = 9; B), and NR1D1CreERT2 (mutant, n = 6; control, n = 9; C) strains. The probability of win-stay was calculated as the number of times the animal chose the same side as the side chosen during the previously rewarded trial divided by the total number of rewarded trials, while the lose-shift probability was calculated as the number of times the animal changed its choice when the preceding response yielded no reward divided by the total number of unrewarded trials. D–F, Simulation performance of the best model (model 3) with respect to mimicking win-stay/lose-shift choice behavior. Data are represented as the mean ± SEM. **p < 0.01 (t test).
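The win-stay and lose-shift probabilities defined in the legend above can be computed directly from per-trial choice and reward sequences, as in this sketch (the arrays in the example are hypothetical placeholders):

import numpy as np

def win_stay_lose_shift(choices, rewards):
    """Compute win-stay and lose-shift probabilities as defined above."""
    choices, rewards = np.asarray(choices), np.asarray(rewards)
    stayed = choices[1:] == choices[:-1]       # same side as previous trial
    won = rewards[:-1] == 1                    # previous trial rewarded
    lost = rewards[:-1] == 0                   # previous trial unrewarded
    win_stay = stayed[won].mean() if won.any() else np.nan
    lose_shift = (~stayed)[lost].mean() if lost.any() else np.nan
    return win_stay, lose_shift

# Example with a short hypothetical session.
print(win_stay_lose_shift(choices=[0, 0, 1, 1, 0], rewards=[1, 0, 1, 1, 0]))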
Figure 4.
Reaction times in the probabilistic reinforcement learning task. A–I, Graphs show the reaction times observed in the NR1DATCreERT2 (mutant, n = 6; control, n = 8; A–C), mGluR5KD-D1 (mutant, n = 8; control, n = 9; D–F), and NR1D1CreERT2 (mutant, n = 6; control, n = 9; G–I) strains. A, D, and G show the time elapsed from the trial onset to the choice port entry. B, E, and H show the time from the new trial onset to the choice port entry following previously unrewarded (lose) or rewarded (win) trials. C, F, and I summarize the time from the reward delivery to the reward port entry. Values represent the mean choice latency (all sessions combined) ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001 (Bonferroni-corrected t test or t test).
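As an illustration of the Bonferroni-corrected t tests referenced in the legend, the sketch below compares hypothetical mutant and control reaction-time samples and multiplies each p value by the number of comparisons; the data, group sizes, and comparison labels are placeholders, not the paper's.

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)
comparisons = {
    "trial onset to choice port": (rng.normal(2.0, 0.5, 8), rng.normal(2.6, 0.5, 6)),
    "reward delivery to reward port": (rng.normal(1.0, 0.3, 8), rng.normal(1.1, 0.3, 6)),
}
n_tests = len(comparisons)
for name, (control, mutant) in comparisons.items():
    t, p = ttest_ind(control, mutant)
    p_corrected = min(p * n_tests, 1.0)   # Bonferroni correction
    print(f"{name}: uncorrected p = {p:.3f}, corrected p = {p_corrected:.3f}")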
Figure 5.
The probability-discounting task. A, Schematic representation of the probability-discounting task. One nose-poke port was associated with the delivery of small certain rewards, while the other nose-poke port was associated with the delivery of large uncertain rewards. Each session consisted of 20 forced trials during which only one port was active, followed by 40 free choice trials during which both ports were active. B–D, The graphs show the frequency of choosing the larger reward as a function of its probability in the NR1DATCreERT2 (mutant, n = 6; control, n = 7; B), mGluR5KD-D1 (mutant, n = 8; control, n = 9; C), and NR1D1CreERT2 (mutant, n = 5; control, n = 9; D) strains. Data are represented as the mean ± SEM.
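To make the structure of the choice explicit, the expected value of the large uncertain option can be compared with the small certain option at each probability level; the reward sizes and probability levels below are hypothetical, since the actual magnitudes are not stated in this legend.

# Expected-value comparison in a probability-discounting task.
small_certain = 1.0        # size of the small certain reward (hypothetical units)
large_uncertain = 4.0      # size of the large uncertain reward (hypothetical)
for p_large in (1.0, 0.5, 0.25, 0.125):   # example probability levels
    ev_large = p_large * large_uncertain
    better = "large/uncertain" if ev_large > small_certain else "small/certain"
    print(f"p = {p_large:>5}: EV(large) = {ev_large:.2f} -> {better} port maximizes reward")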
Figure 6.
Reaction times in the probability-discounting task. A–C, Time elapsed from the trial onset to the choice port entry during the forced choice (left) and free choice (right) trials in the NR1DATCreERT2 (mutant, n = 6; control, n = 7; A), mGluR5KD-D1 (mutant, n = 8; control, n = 9; B), and NR1D1CreERT2 (mutant, n = 5; control, n = 9; C) strains. Bars represent the mean choice latency ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001 (Bonferroni-corrected t test).


