Nature. 2025 Mar;639(8053):143-152.
doi: 10.1038/s41586-024-08412-x. Epub 2024 Nov 25.

Opponent control of reinforcement by striatal dopamine and serotonin

Daniel F Cardozo Pinto et al. Nature. 2025 Mar.

Abstract

The neuromodulators dopamine (DA) and serotonin (5-hydroxytryptamine; 5HT) powerfully regulate associative learning [1-8]. Similarities in the activity and connectivity of these neuromodulatory systems have inspired competing models of how DA and 5HT interact to drive the formation of new associations [9-14]. However, these hypotheses have not been tested directly because it has not been possible to interrogate and manipulate multiple neuromodulatory systems in a single subject. Here we establish a mouse model that enables simultaneous genetic access to the brain's DA and 5HT neurons. Anterograde tracing revealed the nucleus accumbens (NAc) to be a putative hotspot for the integration of convergent DA and 5HT signals. Simultaneous recording of DA and 5HT axon activity, together with genetically encoded DA and 5HT sensor recordings, revealed that rewards increase DA signalling and decrease 5HT signalling in the NAc. Optogenetically dampening DA or 5HT reward responses individually produced modest behavioural deficits in an appetitive conditioning task, while blunting both signals together profoundly disrupted learning and reinforcement. Optogenetically reproducing DA and 5HT reward responses together was sufficient to drive the acquisition of new associations and supported reinforcement more potently than either manipulation did alone. Together, these results demonstrate that striatal DA and 5HT signals shape learning by exerting opponent control of reinforcement.

Conflict of interest statement

Competing interests: N.E. is a consultant for Boehringer Ingelheim. B.S.B. is a co-founder of Magnus Medical. R.C.M. is on the scientific advisory boards of MapLight Therapeutics, MindMed, Bright Minds Biosciences and Aelis Farma. D.F.C.P., M.B.P., M.Y.G., G.C.T. and A.P.F.C. declare no competing interests.

Figures

Extended Data Fig. 1: DAT-Cre+/−;SERT-Flp+/− mice enable orthogonal and specific access to VTADA and DR5HT neurons
a, Surgical strategy to validate orthogonality of genetic access to VTADA and DR5HT neurons in DAT-Cre+/−;SERT-Flp+/− mice. b, Example image showing negligible Flp-dependent EYFP expression in the VTA. c-d, Example images of the DR showing Cre-dependent mCherry expression is restricted to DRDA neurons (c) and is not observed in DR5HT neurons (d). e, Surgical strategy for control experiments to validate the specificity of our viral targeting strategy. f-g, example images showing negligible Cre-dependent mCherry expression in the VTA in the absence of Cre (f) and negligible Flp-dependent EYFP expression in the DR in the absence of Flp (g). In a-d, n = 1 mouse. In e-f, n = 1 mouse.
Extended Data Fig. 2: Overlap between VTADA and DR5HT axons varies across limbic regions
a, Surgical strategy for VTADA and DR5HT axon tracing experiments. b, Example images showing labeled VTADA and DR5HT axons in sagittal sections (top). Insets (center, bottom) correspond to the boxed regions in the top images. c, Relative density of VTADA (left) and DR5HT (right) axons across limbic regions. d, Background subtracted (left) and segmented (center) images showing VTADA and DR5HT axons in the anterior NAc. Insets (right) show magnified views of the corresponding boxed areas in the left and center images. e, same as d, but for the posterior NAc. f, same as d, but for the anterior BLA. g, same as d, but for the posterior BLA. h, same as d, but for the Ant Ctx. i, Relative colocalization between VTADA and DR5HT axons across the regions shown in d-h. In c and i, n = 5 mice. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
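The captions report group data as mean +/− s.e.m. As a minimal sketch of that summary statistic, assuming the conventional definition (sample standard deviation divided by the square root of the number of animals), the calculation looks like this. The function name and the input values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, s.e.m.), where s.e.m. = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate s.e.m.")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical per-mouse values for one region (n = 5 mice); not data from the paper.
per_mouse = [1.0, 2.0, 3.0, 4.0, 5.0]
m, sem = mean_and_sem(per_mouse)
print(f"{m:.2f} +/- {sem:.2f}")  # prints "3.00 +/- 0.71"
```

With small groups such as n = 5 mice, the s.e.m. is sensitive to single animals, which is one reason the captions also point to the per-mouse traces and to Supplementary Table 1 for the full statistics.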
Extended Data Fig. 3: NAc-projecting DR5HT neurons are distinct from CeA- and OFC-projecting DR5HT subsystems
a, Surgical strategy for retrograde labeling of projection-defined DR5HT subsystems. b, Example images of the DR showing retrogradely labeled neurons (posterior DR image reproduced here from Fig. 1m for comparison). c, Percentage of OFC-, CeA-, and NAc-projecting DR neurons that are TpH+. d, Pie graphs showing the fraction of OFC- (left), CeA- (center), and NAc- (right) projecting DR5HT neurons that send axon collaterals to the other two target regions. e, NAc-projecting DR5HT neurons are a distinct population from the CeA- and OFC-projecting DR5HT subsystems. f, Distributions of OFC- (left), CeA- (center), and NAc- (right) projecting DR5HT neurons across the DR’s anteroposterior axis. g, same as f, but across the dorsomedial (dm), ventromedial (vm), and lateral (l) subregions shown in b. h, Injection strategy (top) and example injection site images (bottom) for retrograde tracing control experiments. i, Example images showing retrogradely labeled cells in the DR. j, When all three retrograde tracers were injected together into the same target structure, the vast majority of labeled cells in the DR were positive for all three tracers. In a-g, n = 3 mice. In h-j, n = 2 mice. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
Extended Data Fig. 4: Inverse VTADA and DR5HT axon reward responses are consistent across mice and are not explained by motion
a, Optical fiber tip placements for mice used in two-color axon photometry experiments. b, RCaMP2 (VTADA axon) recordings from individual mice aligned to CS-onset (top) or reward consumption (bottom) early (left) and late (right) in training. c, Same as b, but for GCaMP6 (DR5HT axon) recordings. d, Average GCaMP6, RCaMP2, and UV photometry traces from an example mouse (n = 25 trials from 1 mouse) aligned to shock onset. Top, demodulated and mean subtracted traces before motion correction, z-scoring, or smoothing; center, Z-scored GCaMP6 traces from the same session with and without motion correction; bottom, Z-scored RCaMP2 traces from the same session with and without motion correction (see Methods for motion correction details). e, same as d, but for photometry traces aligned to reward consumption (n = 35 trials from 1 mouse). Example traces in d-e are from the same mouse shown in Fig. 2b and Fig. 2k–l.
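The photometry panels refer to mean-subtracted and Z-scored traces and to max/min Z-score quantifications. The authors' exact procedure is given in their Methods and is not reproduced here. The following is only a sketch of one common approach, under the assumption that each event-aligned trace is Z-scored against a pre-event baseline window. The function names, the baseline length and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_to_baseline(trace, baseline_len):
    """Z-score a trace using the mean and SD of its first `baseline_len` samples.

    Assumption: the pre-event baseline occupies the start of the trace.
    """
    baseline = trace[:baseline_len]
    mu = mean(baseline)
    sd = pstdev(baseline)
    if sd == 0:
        raise ValueError("baseline has zero variance; cannot Z-score")
    return [(x - mu) / sd for x in trace]


def peak_and_trough(z_trace):
    """Return the (max, min) Z-score, as used for excited vs inhibited responses."""
    return max(z_trace), min(z_trace)


# Hypothetical event-aligned trace: 4 baseline samples, then 2 post-event samples.
trace = [1.0, 3.0, 1.0, 3.0, 6.0, 0.0]
z = zscore_to_baseline(trace, baseline_len=4)
z_max, z_min = peak_and_trough(z)
print(z_max, z_min)  # prints "4.0 -2.0"
```

In this framing, an excitatory response (as described for VTADA axons at reward) would be summarized by the maximum Z-score and an inhibitory response (as described for DR5HT axons) by the minimum.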
Extended Data Fig. 5: Fiber placement validation and additional analyses for GRAB sensor recording experiments in NAcpmSh
a, Optical fiber tip placements for mice used in GRAB-DA (top) and GRAB-5HT (bottom) experiments in the NAcpmSh. b, GRAB-DA recordings aligned to reward consumption showing the average response across trials for each mouse. c, Same as b, but for GRAB-5HT. d-e, GRAB-5HT recordings aligned to reward consumption during days 1–3 (d) and 5–7 (e) of a task where rewards were delivered randomly and without any predictive cues (Data are shown as mean +/− s.e.m.). For all panels, n = 5 mice per group.
Extended Data Fig. 6: Optical fiber placement validation and control assays for loss-of-function experiments in the NAcpmSh
a-d, Example images of the injection sites (DR, top left; VTA, top right) and optical fiber implantation sites (bottom) for the EYFP/EYFP (a), EYFP/NpHR (b), ChR2/EYFP (c), and ChR2/NpHR (d) groups. e-h, Optical fiber tip placements for mice in the EYFP/EYFP (e), EYFP/NpHR (f), ChR2/EYFP (g), and ChR2/NpHR (h) groups. i, Percent change in velocity during the light-on epochs relative to the light-off epochs in the open field test. j, Difference score for time spent on each side of the chamber in the RTPP task. In i-j: EYFP/EYFP, n = 9 mice; NpHR/EYFP, n = 8 mice; EYFP/ChR2, n = 7 mice; NpHR/ChR2, n = 9 mice. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
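The two control measures in panels i and j can be illustrated with a short sketch. The percent change follows directly from the caption's wording (light-on relative to light-off). The difference score is an assumption: the caption does not define it, so the version below simply subtracts time on the unstimulated side from time on the stimulated side. All names and numbers are hypothetical.

```python
def percent_change(light_on, light_off):
    """Percent change of a light-on measurement relative to light-off."""
    if light_off == 0:
        raise ValueError("light-off value must be non-zero")
    return 100.0 * (light_on - light_off) / light_off


def difference_score(stim_side_s, other_side_s):
    """Assumed definition: seconds on the stimulated side minus the other side."""
    return stim_side_s - other_side_s


# Hypothetical open-field velocities (cm/s) and RTPP chamber times (s).
velocity_change = percent_change(light_on=6.0, light_off=5.0)
side_preference = difference_score(stim_side_s=400.0, other_side_s=200.0)
print(velocity_change, side_preference)
```

Under these definitions, a value near zero on either measure would indicate that the light itself did not change locomotion or side preference.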
Extended Data Fig. 7: GRAB sensor validation of the gain- and loss-of-function manipulations of VTADA and DR5HT reward responses
a, Viral strategy enabling VTADA stimulation, VTADA inhibition, and GRAB-DA recordings in the same mouse (top); and, viral strategy enabling DR5HT stimulation, DR5HT inhibition, and GRAB-5HT recordings in the same mouse (bottom). b, GRAB-DA recordings aligned to sucrose consumption alone (gray, n = 913 trials from 11 mice) or sucrose consumption with VTADA inhibition (red, n = 1380 trials from 11 mice) in the same mice (top); and, GRAB-5HT recordings aligned to sucrose consumption alone (gray, n = 539 trials from 2 mice) or sucrose consumption with DR5HT stimulation (blue; n = 680 trials from 2 mice) in the same mice (bottom). c, GRAB-DA recordings aligned to sucrose reward consumption (gray, n = 274 trials from 5 mice) or to the onset of VTADA stimulation (blue, n = 779 trials from 5 mice) in the same mice (top); and, GRAB-5HT recordings aligned to sucrose reward consumption (gray, n = 1006 trials from 12 mice) or to the onset of DR5HT inhibition (red, n = 1872 trials from 12 mice) in the same mice (bottom). Data are shown as mean +/− s.e.m.
Extended Data Fig. 8: Optical fiber placement validation and control assays for gain-of-function experiments in the NAcpmSh
a-d, Example images of the injection sites (DR, top left; VTA, top right) and optical fiber implantation sites (bottom) for the EYFP/EYFP (a), NpHR/EYFP (b), EYFP/ChR2 (c), and NpHR/ChR2 (d) groups. e-h, Optical fiber tip placements for mice in the EYFP/EYFP (e), NpHR/EYFP (f), EYFP/ChR2 (g), and NpHR/ChR2 (h) groups. Two mice in the NpHR/EYFP group died before their brains could be collected for histology. i, Percent change in velocity during the light-on epochs relative to the light-off epochs in the open field test (n = 6 mice per group). j, Number of trials of each type obtained during days 1–9 of training in the optogenetic conditioning task shown in Fig. 4m–p (n = 10 mice per group). Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
Extended Data Fig. 9: Optical fiber placement validation and individual mouse traces for GRAB sensor recording experiments in NAccore
a-b, Optical fiber tip placements for mice used in GRAB-DA (a) and GRAB-5HT (b) experiments in the NAccore. One mouse in the GRAB-5HT group died before its brain could be collected for histology. c, GRAB-DA recordings aligned to reward consumption showing the average response across trials for each mouse (n = 3 mice). d, Same as c, but for GRAB-5HT (n = 4 mice).
Extended Data Fig. 10: Optical fiber placement validation for loss-of-function experiments in the NAccore
a-b, Example images of the injection sites (DR, left; VTA, center) and optical fiber implantation sites (right) for the EYFP/EYFP (a), and ChR2/NpHR (b) groups. c-d, Optical fiber tip placements for the EYFP/EYFP (c), and ChR2/NpHR (d) groups.
Fig. 1: Mapping convergent DA and 5HT inputs to limbic structures involved in learning
a, Schematic describing the generation of DAT-Cre+/−;SERT-Flp+/− mice enabling simultaneous and independent genetic access to DA and 5HT neurons. b, Viral strategy for labeling VTADA and DR5HT neurons in a single mouse. c, Example sagittal section depicting mCherry-expressing neurons in the VTA and EYFP-expressing neurons in the DR. d-e, Example images showing colocalization between Cre-dependent mCherry and TH in the VTA. f, Cell-type specificity quantification for VTADA neurons in DAT-Cre+/−;SERT-Flp+/− mice (n = 3 mice). g-h, Example images showing colocalization between Flp-dependent EYFP and TpH in the DR. i, Cell-type specificity quantification for DR5HT neurons in DAT-Cre+/−;SERT-Flp+/− mice (n = 3 mice). j, Example images of VTADA and DR5HT inputs to limbic structures. k, Relative colocalization between VTADA and DR5HT axons across limbic regions (left) and striatal subregions (right; n = 5 mice). l, Left, injection strategy to label projection-defined DA subsystems. Right, example image showing retrogradely labeled VTADA neurons. m, Left, injection strategy to label projection-defined 5HT subsystems. Right, example image showing retrogradely labeled DR5HT neurons. Note the lack of colocalization between Ctb-488 and the other two tracers in l-m. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
Fig. 2: Convergent DA and 5HT inputs to NAcpmSh show inverse responses to rewards
a, Surgical strategy to record VTADA and DR5HT axon calcium activity in the NAc. b, Confocal image of the recording site from an example mouse. c, Schematics of the fiber photometry system (left) and Pavlovian conditioning task (right). d, By the end of training, mice had acquired the CS-US association as indicated by a decreased latency to collect rewards following CS-onset (left) and an increased number of anticipatory port entries made during the first 5 s of the CS before US delivery (right; n = 5 mice). e-j, Population recordings and max/min Z-score quantifications of VTADA and DR5HT axon calcium activity aligned to CS onset (top) or reward consumption (bottom) during early (left) and late (right) training. Neither VTADA nor DR5HT axons showed a CS response at any stage of training (e-g). Late in training, VTADA axons were excited by rewards while DR5HT axons were inhibited (h-j). k-l, Simultaneously recorded VTADA and DR5HT axon calcium responses in an individual mouse during late training aligned to CS-onset (k) and reward consumption (l). m, There was no correlation between the relative timing of the RCaMP2 max and GCaMP6 min (left), but we observed a negative correlation between the magnitude of the RCaMP2 max and the GCaMP6 min (right). n, Surgical strategy (left) and example image of the recording site (right) for GRAB sensor recordings in the NAcpmSh. o-p, DA release increased (o) and 5HT release decreased (p) following reward consumption. In d-j and m, n = 5 mice. In o-p, n = 5 mice per group. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
Fig. 3: Blunting convergent DA and 5HT reward responses disrupts learning and reinforcement
a, Optogenetic strategy to blunt VTADA and/or DR5HT reward responses during learning. b, Images of injection/implantation sites from an example mouse. c, Schematic of the Pavlovian conditioning task. d, Number of rewards obtained over training. Inset, average rewards across days. e, Number of port entries over training. Inset, port entries on the final day of training. f, Median latency to enter the port after CS-onset over training. Inset, median latency across days. g, Probability of occupying the reward port as a function of time within a trial. Inset, average probability during the CS period. h, Baseline-normalized probability of occupying the reward port as a function of time within a trial. Inset, average normalized probability during the CS period. i, Percentage of time that mice occupied the port during each period of a trial, relative to the baseline period. j, Number of port entries during an extinction session. k, Latency to enter reward port after CS-onset during an extinction session. l-m, Same as h, but during an extinction session. n, Percentage of time that mice occupied the port during the first 5 trials of the extinction session. o, Distance traveled in the open field. p, Time spent per chamber in the real-time place preference test. q, Average body weight during training days. r, Amount of sucrose solution consumed during a free-access reward task. In d-n and q: EYFP/EYFP, n = 10 mice; NpHR/EYFP, n = 8 mice; EYFP/ChR2, n = 8 mice; NpHR/ChR2, n = 9 mice. In o-p: EYFP/EYFP, n = 9 mice; NpHR/EYFP, n = 8 mice; EYFP/ChR2, n = 7 mice; NpHR/ChR2, n = 9 mice. In r: EYFP/EYFP, n = 9 mice; NpHR/EYFP, n = 8 mice; EYFP/ChR2, n = 8 mice; NpHR/ChR2, n = 9 mice. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
Fig. 4: Integration of opponent DA and 5HT reward responses drives new learning
a, Optogenetic strategy to reproduce VTADA and/or DR5HT reward responses. b, Images of injection/implantation sites from an example mouse. c, Schematic of the CPP tasks. d-e, VTADA excitation and DR5HT inhibition together, but not either manipulation alone, produced CPP. f-h, Neither DR5HT inhibition alone (f) nor VTADA stimulation alone (g) produced CPP in the same mice that previously showed CPP for both manipulations together (h). Purple bars in h represent the same data as the purple bars in e. i, VTADA stimulation together with DR5HT inhibition produced a greater real-time place preference than either manipulation alone. j, VTADA stimulation alone, or together with DR5HT inhibition, increased locomotion in the open field test. k, Schematic of the optogenetic conditioning paradigm with three CS- (compound sound and port light cues) US- (VTADA stimulation and/or DR5HT inhibition) pairs. l, Mice acquired conditioned approach responses to CSs paired with VTADA stimulation alone, DR5HT inhibition alone, or both manipulations together (left), but conditioned responses were more accurate for the CS paired with both manipulations together. m, After training, CSs paired with VTADA stimulation alone, DR5HT inhibition alone, or both manipulations together all functioned as conditioned reinforcers, but responding was above chance level only for the CS paired with both manipulations together. n, During the primary reinforcement test, mice preferred VTADA stimulation and DR5HT inhibition delivered together to either manipulation alone. In d-i, n = 6 mice per group. In k-l: EYFP/EYFP, n = 6 mice; EYFP/NpHR, n = 6 mice; ChR2/EYFP, n = 5 mice; ChR2/NpHR, n = 6 mice. In n-p, n = 10 mice. In j, m-n dashed lines represent chance levels. Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.
Fig. 5: Opponent control of reinforcement by DA and 5HT generalizes to the NAccore
a, Surgical strategy to label DR5HT neurons projecting to different NAc subregions. b, Example image of the injection sites. c, Example images of retrogradely labeled DR5HT neurons showing colocalization between NAcpmSh-, NAccore-, and NAclatSh-projecting subpopulations (n = 2 mice). d-e, Surgical strategy (d) and example image of the recording site (e) for GRAB sensor recordings in the NAccore. f-g, DA release increased (f, n = 3 mice) and 5HT release decreased (g, n = 4 mice) following reward consumption. h, Optogenetic strategy to reproduce VTADA and/or DR5HT reward responses in the NAccore. i, Images of injection/implantation sites from an example mouse. j, Schematics of the RTPP tasks. k, DR5HT inhibition alone did not produce RTPP (EYFP/EYFP, n = 5 mice; ChR2/NpHR, n = 6 mice). l, VTADA stimulation drove RTPP in the ChR2/NpHR group (right, n = 6 mice) but not in EYFP/EYFP controls (left, n = 5 mice). m, Mice in the ChR2/NpHR group (right, n = 5–6 mice), but not the EYFP/EYFP group (left, n = 5 mice), preferred VTADA stimulation together with DR5HT inhibition compared to VTADA stimulation alone. n, Difference score for the red-light only RTPP experiment in k. o, Same as n, but for the 12 mW experiment in l. p, Same as n, but for the 3 mW experiment in m. In n-p: EYFP/EYFP, n = 5 mice; ChR2/NpHR, n = 6 mice. q-r, In a 4-choice RTPP task, mice showed a within-session place preference for VTADA stimulation delivered together with DR5HT inhibition compared to either manipulation alone (n = 5 mice). Data are shown as mean +/− s.e.m. and significance is denoted as *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. See Supplementary Table 1 for statistics.

References

    1. Schultz W, Dayan P & Montague PR A Neural Substrate of Prediction and Reward. Science 275, 1593–1599 (1997).
    2. Steinberg EE et al. A causal link between prediction errors, dopamine neurons and learning. Nat Neurosci 16, 966–973 (2013).
    3. Saunders BT, Richard JM, Margolis EB & Janak PH Dopamine neurons create Pavlovian conditioned stimuli with circuit-defined motivational properties. Nat Neurosci 21, 1072–1083 (2018).
    4. Sengupta A & Holmes A A Discrete Dorsal Raphe to Basal Amygdala 5-HT Circuit Calibrates Aversive Memory. Neuron 103, 489–505.e7 (2019).
    5. Zeng J et al. Local 5-HT signaling bi-directionally regulates the coincidence time window for associative learning. Neuron 111, 1118–1135.e5 (2023).

ADDITIONAL REFERENCES

    1. Cardozo Pinto DF et al. Characterization of transgenic mouse models targeting neuromodulatory systems reveals organizational principles of the dorsal raphe. Nat Commun 10, 443 (2019).
    1. Otsu N A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9, 62–66 (1979).
    1. Bunin MA & Wightman RM Quantitative evaluation of 5-hydroxytryptamine (serotonin) neuronal release and uptake: an investigation of extrasynaptic transmission. The Journal of Neuroscience 18, 4854–4860 (1998).
    1. Liu C, Goel P & Kaeser PS Spatial and temporal scales of dopamine transmission. Nat Rev Neurosci 22, 345–358 (2021).
    1. Paxinos G & Franklin KBJ The mouse brain in stereotaxic coordinates. (Academic Press, 2001).
