Neuron. 2016 Jun 15;90(6):1312-1324.
doi: 10.1016/j.neuron.2016.04.043. Epub 2016 May 26.

Endocannabinoid Modulation of Orbitostriatal Circuits Gates Habit Formation


Christina M Gremel et al. Neuron.

Abstract

Everyday function demands efficient and flexible decision-making that allows for habitual and goal-directed action control. An inability to shift has been implicated in disorders with impaired decision-making, including obsessive-compulsive disorder and addiction. Despite this, our understanding of the specific molecular mechanisms and circuitry involved in shifting action control remains limited. Here we identify an endogenous molecular mechanism in a specific cortical-striatal pathway that mediates the transition between goal-directed and habitual action strategies. Deletion of cannabinoid type 1 (CB1) receptors from cortical projections originating in the orbital frontal cortex (OFC) prevents mice from shifting from goal-directed to habitual instrumental lever pressing. Activity of OFC neurons projecting to dorsal striatum (OFC-DS) and, specifically, activity of OFC-DS terminals is necessary for goal-directed action control. Lastly, CB1 deletion from OFC-DS neurons prevents the shift from goal-directed to habitual action control. These data suggest that the emergence of habits depends on endocannabinoid-mediated attenuation of a competing circuit controlling goal-directed behaviors.


Conflict of interest statement

Author information: author deposition statement, competing interest declarations. The authors have no competing financial interests.

Figures

Figure 1. Within-subject shifting between goal-directed and habitual actions
A, Acquisition schematic of lever pressing for a food outcome under random interval (RI) and random ratio (RR) schedules of reinforcement. The same mouse is placed successively in two operant chambers distinguished by contextual cues, where it is trained to press the same lever (e.g., left lever) for the same outcome (food pellets or sucrose solution; e.g., food pellets). The bias towards goal-directed actions is generated through use of RR schedules of reinforcement, where the reinforcer is delivered following, on average, the nth lever press (2 days of n = 10 followed by 4 days of n = 20). In contrast, RI reinforcement schedules are used to bias towards use of habitual actions, with the reinforcer delivered following the first lever press after, on average, an interval of t has passed (2 days of t = 30 s followed by 4 days of t = 60 s). Each day following lever-press training, the other outcome (e.g., sucrose) is provided in the home cage. B, Response rate for a control cohort under RI and RR schedules across acquisition. C, Schematic of outcome devaluation procedure. On the Valued day (V), mice are fed (1 h) a control outcome (e.g., sucrose) that they have experienced in their home cage. On the Devalued day (DV), mice are prefed the outcome associated with the lever press (e.g., food pellet). Following prefeeding, mice are placed into the RI and RR contexts, and lever presses are measured for 5 min in the absence of reinforcer delivery. D, Lever pressing in V and DV states in RI (grey) and RR (black) contexts. E, Distribution of lever presses between V and DV days in RI and RR training contexts. F, Within-subject devaluation indexes in previously trained RI and RR contexts, reflecting potential shifts in the magnitude of devaluation. Individual results and mean ± SEM are shown. * = p < 0.05.
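The two reinforcement schedules described in panel A can be illustrated with a small simulation. This is a minimal sketch, not the authors' operant-chamber software: the function names are hypothetical, the RR schedule is assumed to reinforce each press independently with probability 1/n, and the RI schedule is assumed to draw exponentially distributed waiting times with mean t. Only the schedule parameters (n = 10 then 20; t = 30 s then 60 s) come from the caption.

```python
import random

# Training progression taken from the caption: (days, schedule parameter).
RR_PROGRESSION = [(2, 10), (4, 20)]  # n: mean lever presses per reinforcer
RI_PROGRESSION = [(2, 30), (4, 60)]  # t: mean seconds until a reinforcer is available


def random_ratio_rewards(n_presses, mean_ratio, rng):
    """Count reinforcers under RR, assuming each press pays off with p = 1/mean_ratio."""
    p = 1.0 / mean_ratio
    return sum(1 for _ in range(n_presses) if rng.random() < p)


def random_interval_rewards(press_times, mean_interval, rng):
    """Count reinforcers under RI.

    Assumption: a reinforcer becomes available after an exponentially
    distributed wait (mean = mean_interval seconds). The first press at or
    after that moment collects it, and a new wait starts from that press.
    """
    rewards = 0
    armed_at = rng.expovariate(1.0 / mean_interval)
    for t in sorted(press_times):
        if t >= armed_at:
            rewards += 1
            armed_at = t + rng.expovariate(1.0 / mean_interval)
    return rewards


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical 30-min sessions: one press every 2 s vs. one press every 1 s.
    slow = [2.0 * i for i in range(900)]
    fast = [1.0 * i for i in range(1800)]
    for label, presses in (("slow", slow), ("fast", fast)):
        rr = random_ratio_rewards(len(presses), 20, rng)
        ri = random_interval_rewards(presses, 60, rng)
        print(label, "RR20 rewards:", rr, "RI60 rewards:", ri)
```

Under these assumptions, pressing faster tends to earn proportionally more reinforcers on the RR schedule, whereas on the RI schedule the number of reinforcers is limited mainly by elapsed time.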
Figure 2. Deletion of CB1 receptor from OFC neurons impairs habitual action control
A, Schematic of viral strategy and ex vivo physiological assessment. B, Cre-dependent ChR2 eYFP detected at OFC injection site (left) and downstream DS (right) in CB1floxCamKII Cre mice. C, Representative traces showing assessment of light-evoked excitatory post-synaptic currents (oEPSCs) in DS MSNs, with the blue circle indicating the light pulse (473 nm wavelength, 5 ms) in CB1floxCamKII Cre mice and wild-type littermates. WIN55,212 is a CB1 receptor agonist. D, Relative amplitude of DS MSN oEPSCs following WIN55,212 application (10 min, 1 μM) differed between wild-type and CB1floxCre/CB1floxCamKII Cre mice. E, Experimental design schematic. F, Lever presses made following outcome devaluation procedures in valued (V) and devalued (DV) states across RI and RR training contexts for the different treatment groups. G, Normalized lever presses during outcome devaluation testing, showing the distribution of lever presses between V and DV states in the different training contexts (RI and RR) across the different treatment groups. H, Shifts in outcome devaluation index between RI and RR training contexts for Control mice, CB1floxCre mice, and CB1floxCamKII Cre mice. Individual results and mean ± SEM are shown. * = p < 0.05, # = p = 0.08. See also Supplementary Fig. 1 and 2.
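Panels G and H report normalized lever presses and a devaluation index, but the caption does not give the formulas. The sketch below assumes one common convention from the outcome devaluation literature: presses in each state divided by the total across both states, and an index of (V - DV) / (V + DV). Both function names and both formulas are assumptions for illustration, not definitions taken from this page.

```python
def normalized_presses(valued, devalued):
    """Return the fraction of test presses made in the V and DV states.

    Assumed convention: each count divided by the sum of both counts.
    """
    total = valued + devalued
    if total == 0:
        raise ValueError("no lever presses recorded in either state")
    return valued / total, devalued / total


def devaluation_index(valued, devalued):
    """Assumed index: (V - DV) / (V + DV).

    Values near 0 mean pressing was insensitive to devaluation (habitual);
    positive values mean pressing dropped after devaluation (goal-directed).
    """
    total = valued + devalued
    if total == 0:
        raise ValueError("no lever presses recorded in either state")
    return (valued - devalued) / total


if __name__ == "__main__":
    # Hypothetical press counts for one mouse in each training context.
    print("RR context:", normalized_presses(30, 10), devaluation_index(30, 10))
    print("RI context:", normalized_presses(20, 20), devaluation_index(20, 20))
```

With this convention, a within-subject shift would appear as a higher index in the RR context than in the RI context for the same mouse.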
Figure 3. Activity in OFC-DS neurons is necessary for the shift to goal-directed action control
A, Schematic of combinatorial retrograde and AAV viral strategy. B, Schematic of experimental design with devaluation testing performed following CNO administration. C, hM4D mCherry expression in OFC. D–E, Representative traces showing the ability of injected current (-200 to +300 pA, 100 pA steps) to evoke an action potential in OFC neurons (baseline) and following vehicle (DMSO; D) and CNO (10 μM; E) application. F, Lever presses during outcome devaluation testing across valued (V) and devalued (DV) states in RI and RR training contexts. G, Normalized lever presses during outcome devaluation testing showing the distribution of lever presses between V and DV states in the previously RI and RR trained contexts. H, Devaluation index for each group of mice in the previously trained RI and RR contexts. Individual results and mean ± SEM are shown. * = p < 0.05. See also Supplementary Fig. 3.
Figure 4. Attenuating OFC-DS transmission disrupts goal-directed control
A, Schematic of combinatorial AAV viral strategy. B, Schematic of experimental design with devaluation testing performed following intra-cranial CNO or saline administration. C, (left panel) DIO hM4D mCherry expression in OFC, (center panel) cannula placement within DS, (right panel) DS insert from block outline in center panel showing DIO hM4D mCherry fiber expression in DS. D, Lever presses during outcome devaluation testing across valued (V) and devalued (DV) states in RI and RR training contexts. E, Normalized lever presses during outcome devaluation testing showing the distribution of lever presses between V and DV states in the previously RI and RR trained contexts. F, Devaluation index for each group of mice in the previously trained RI and RR contexts. Individual results and mean ± SEM are shown. * = p < 0.05. See also Supplementary Fig. 4.
Figure 5. Deletion of CB1 receptors in OFC-DS neurons prevents habitual control over actions
A, Schematic of combinatorial viral strategy and ex vivo physiological assessment. B, fp dependent Cre-mCherry detected in OFC (left) and in downstream DS (right). C, Representative traces showing ChR2-mediated firing of an OFC neuron. D, Schematic of experimental design. E, Representative traces showing assessment of DS MSN oEPSCs in a subset of OFC-DS CB1flox mice (n = 2) and a wild-type littermate (Ctl; n = 1) in the absence (left) and presence (right) of WIN55,212. F, Relative amplitude of DS MSN oEPSCs following WIN55,212 application in Ctl and OFC-DS CB1flox mice. G, Lever presses in valued (V) and devalued (DV) states, in RI and RR training contexts. Individual results and mean ± SEM are shown. H, Normalized lever presses during outcome devaluation testing showing the distribution of lever presses across V and DV states in the different training contexts (RI and RR). I, Devaluation index plotted within-subject for RI and RR training contexts for control mice and OFC-DS CB1flox mice. Individual results and mean ± SEM are shown. * = p < 0.05. See also Supplementary Fig. 5.

