Sci Adv. 2025 Jul 25;11(30):eadt4945.

doi: 10.1126/sciadv.adt4945. Epub 2025 Jul 23.

Selective engagement of prefrontal VIP neurons in reversal learning

Jee Hyun Yi¹, Young Ju Yoon¹, Huijeong Jeong², Seo Yeon Choe¹, Min Whan Jung¹,³

Affiliations

¹ Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon 34141, Republic of Korea.
² Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA.
³ Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea.

PMID: 40700480
PMCID: PMC12285700
DOI: 10.1126/sciadv.adt4945

Jee Hyun Yi et al. Sci Adv. 2025.

Abstract

To gain insights into neural mechanisms enabling behavioral adaptations to complex and multidimensional environmental dynamics, we examined roles of vasoactive intestinal polypeptide (VIP)-expressing neurons in mouse medial prefrontal cortex (mPFC) in probabilistic reversal learning. Behaviorally, manipulating VIP neuronal activity left probabilistic classical conditioning unaffected but severely impaired reversal learning. Physiologically, conditioned cue-associated VIP neuronal responses changed abruptly after encountering an unexpected reward. They also conveyed strong reward prediction error signals during behavioral reversal, but not before or after, unlike pyramidal neurons that consistently conveyed error signals throughout all phases. Furthermore, the signal's persistence across trials correlated with reversal learning duration. These results suggest that mPFC VIP neurons play crucial roles in rapid reversal learning, but not in gradual value updating under stable probabilistic contingencies, by monitoring salient deviations from ongoing environmental contingencies and imposing error-correction signals during behavioral adjustments. These findings shed light on the intricate cortical circuit dynamics underpinning behavioral flexibility in complex, multifaceted environments.

Figures

**Fig. 1. Chemogenetic modulation of VIP neuronal activity does not impair probabilistic classical conditioning.**
(A) A schematic for task 1, used for chemogenetic and optogenetic experiments. Head-fixed mice performed a probabilistic classical conditioning task in which two odor cues (CS_rw) were paired with water reward, while two others (CS_pn) were paired with air puff (punishment) at 75% probability. (B) A schematic for viral vector injection and histological confirmation of hM3D(Gq) expression in the mPFC. Scale bar, 500 μm. (C) Sample DMSO (left) and CNO (right) sessions showing mean anticipatory lick numbers (1-s window since cue offset) across trials (moving average of 60 trials) during daily sessions until reaching performance criterion before reversal onset. The cue configuration was identical to that at the end of the previous session. Blue and red traces denote reward-predicting (CS_rw) and punishment-predicting (CS_pn) cues, respectively. (D) Mean anticipatory lick difference between CS_rw and CS_pn trials under DMSO or CNO treatment for hM3D(Gq) (left) and hM4D(Gi) (right) groups. (E) The number of trials to performance criterion with DMSO or CNO treatment. Not significant (n.s.), P > 0.05, one-way repeated-measures ANOVA. (F) Sample DMSO (red) and CNO (gray) sessions from an hM4D(Gi)-expressing mouse showing lick rates (500-ms moving average, 100-ms steps) grouped by reward delivery in the previous trial (solid and dashed lines; rewarded and unrewarded, respectively). Only consecutive CS_rw trials were analyzed. (G) Anticipatory lick numbers grouped by drug and reward delivery in the previous trial for two consecutive CS_rw presentations. *P < 0.05; ***P < 0.001, two-way repeated-measures ANOVA. Gray lines, individual animal data. Shading [(C) and (F)], SEM across trials. Bar graphs and error bars [(D), (E), and (G)], mean and SEM across animals [hM3D(Gq), n = 13; hM4D(Gi), n = 7 mice].

**Fig. 2. Chemogenetic modulation of VIP neuronal activity impairs reversal learning.**
(A) A schematic for reversal learning in task 1. One set of reward-predicting (CS_rw) and punishment-predicting (CS_pn) cues was subject to cue-outcome contingency reversal. (B) A sample session showing changes in cue-dependent anticipatory licking in the course of reversal learning. Saturated colors indicate the cue set subjected to reversal (blue, CS_rw→pn; red, CS_pn→rw), while soft colors denote the other cues (blue, CS_rw; red, CS_pn). The same format as in Fig. 1C. (C and D) Effects of chemogenetic activation [hM3D(Gq); left], chemogenetic inactivation [hM4D(Gi); middle], and control manipulation (mCherry; right) on reversal learning. (C) Anticipatory lick numbers in response to CS_rw→pn (blue) and CS_pn→rw (red) in the course of reversal learning (60-trial moving average, 15-trial steps). Thin lines, individual animal data. Thick lines, their averages. Black square, significant (P < 0.05, t test) difference between two trial types (CS_rw→pn versus CS_pn→rw). Black dashed lines, first occurrence of significantly higher CS_pn→rw anticipatory lick frequency since reversal onset. (D) The number of trials to reversal criterion with DMSO or CNO treatment. Gray lines, individual animal data. Bar graphs, mean across animals. *P < 0.05; **P < 0.01; ***P < 0.001, Bonferroni post hoc test following one-way repeated-measures ANOVA. Error bars and shading, SEM across trials (B) or animals (else) [hM3D(Gq), n = 13; hM4D(Gi), n = 7; mCherry, n = 5 mice].
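
The smoothing quoted in the caption above (60-trial moving average, 15-trial steps) can be illustrated with a minimal sketch. The function name, the choice to index each window by its first trial, and the dropping of incomplete trailing windows are assumptions made for illustration; the caption does not specify how windows were aligned or how session edges were handled.

```python
def stepped_moving_average(values, window=60, step=15):
    """Average `values` over windows of `window` trials advanced by `step` trials.

    Returns (start_index, mean) pairs. Assumption: a window is labeled by its
    first trial, and trailing windows shorter than `window` are dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    out = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        out.append((start, sum(chunk) / window))
    return out


# Hypothetical per-trial anticipatory lick counts (not data from the paper):
# no licking for 60 trials, then 4 licks per trial for 60 trials.
licks = [0] * 60 + [4] * 60
smoothed = stepped_moving_average(licks)
for start, mean in smoothed:
    print(start, mean)
```

With these made-up counts the smoothed trace rises in equal increments across the step change, which shows how a 60-trial window blurs an abrupt behavioral switch over several 15-trial steps.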

**Fig. 3. VIP neurons carry diverse cue- and outcome-related signals during probabilistic classical conditioning.**
(A) A schematic for task 2, used for calcium imaging. Right: Time courses of cue-dependent lick rate during a sample session. (B) A schematic (left) and a coronal brain section (middle) showing the GRIN prism lens position and the spread of jGCaMP7f (green). Right: A sample field of view. Scale bars, (middle) 500 μm and (right) 100 μm. (C) Sample VIP neurons showing cue-dependent and outcome-dependent responses. (D) Mean baseline-normalized population responses of VIP neurons. (E) Left: Mean cue-preference index (+ and −, higher activity in response to CS_rw and CS_pn, respectively) during the first delay period. Right: Mean outcome-preference indices (+ and −, higher activity in response to US delivery and US omission, respectively) for reward (Rw) and punishment (Pn) during the outcome period. *P < 0.05; **P < 0.01, difference from zero, linear mixed-effects model. (F and G) Outcome-period VIP neuronal activity was analyzed using multiple linear regression to relate t values of reward-predicting cue [CS_rw(t); Eq. 3] and reward [Rw(t); Eq. 1; (F)], as well as punishment-predicting cue [CS_pn(t); Eq. 3] and punishment [Pn(t); Eq. 2; (G)]. Each circle represents one VIP neuron. r, Pearson’s correlation coefficient; β, slope estimated by linear mixed-effects model. ***P < 0.001. The lines represent the fit of the linear mixed-effects model. Error bars and shading, SEM across trials [(A) and (C)] or neurons [(D) and (E); 111 neurons recorded from nine animals].

**Fig. 4. VIP neurons show heterogeneous cue and outcome response dynamics across reversal.**
(A) Top: A schematic for reversal learning in task 2. Bottom: Behavioral performance of all mice used for VIP neuronal recordings. Shown are cue-dependent anticipatory lick numbers during the second delay period in the course of reversal learning (50-trial moving average, 25-trial steps). The same format as in Fig. 2C. (B and C) Cue-dependent responses of sample VIP neurons before (left) and after (right) reversal. Top: Heatmaps of baseline-normalized fluorescence. Bottom: Their averages. Trials were grouped by odor cues [CS_rw→pn (blue) and CS_pn→rw (red)]. (D) Dynamics of cue-preference index (40-trial moving average; color coded) during the cue period (1.5 s since cue onset) in the course of reversal learning. Neurons were aligned according to their cue-preference index before reversal onset. (E) Scatter plot showing the cue-preference index during the cue period before (100 trials before reversal onset; abscissa) and after (trials 151 to 250 since reversal onset; ordinate) reversal. (F and G) Dynamics of cue-preference index during the first delay period (1.5 s since cue offset) in the course of reversal learning. (H to K) Dynamics of reward (H and I) and punishment-preference indices (J and K) during the outcome period (1.5 s since outcome onset). Error bars and shading, SEM across animals (A) or trials [(B) and (C)] (n = 106 neurons recorded from 12 animals).

**Fig. 5. Cue-dependent VIP neuronal responses change abruptly since reversal onset.**
(A) Baseline-normalized individual-trial responses of a sample VIP neuron to CS_rw→pn before (four-trial average) and after the first unexpected reward delivery since reversal onset. (B) Group data showing mean (±SEM across 106 neurons recorded from 12 mice) cue responses of VIP neurons during the first delay period, normalized to pre-reversal responses (100 trials; trial normalization), before and after experiencing the first unexpected outcome delivery. The abscissa denotes the order of the appearance of a given cue (CS_rw→pn or CS_pn→rw) around the first unexpected outcome delivery. Cue responses are shown as the mean of CS_rw→pn and CS_pn→rw responses. (C) The same format as in (B), but sessions were divided according to the type of the first unexpected outcome (reward versus punishment; 69 neurons from six animals and 37 neurons from six animals, respectively) since reversal onset. **P < 0.01; ***P < 0.001, difference from zero, linear mixed-effects model.

**Fig. 6. VIP neurons carry error correction signals during reversal learning, but not before and after.**
(A) t values for CS_rw(t) and Rw(t) before, during, and after reversal [the trials in (B)]. (B) Temporal profile of r_CS-Rw during reversal (all VIP neurons; 100-trial moving window, 10-trial steps). Gray dashed lines, reversal onset and the mean number of trials until reversal criterion. Filled circles, P < 0.05 (difference from 0). (C) Mean temporal profiles of LDI (purple) and smoothed r_CS-Rw (100-trial moving window, one-trial steps, 40-trial smoothing; black) across animals during reversal. Colored dashed lines, mean saturation points for r_CS-Rw and LDI. (D) The relationship between the persistence of RPE signals and the number of trials to reversal criterion. Circles, individual animal data. (E) A schematic for calculating PR[CS_rw←Rw] in the ANCCR model. (F) Mean temporal profiles of LDI (purple) and absolute t value for PR[CS_rw←Rw](t) (green) obtained from multiple linear regression (Eq. 10) of outcome-period VIP neuronal responses (100-trial moving window, one-trial steps, 40-trial smoothing) across animals during reversal. Colored dashed lines, mean saturation points of the absolute t value for PR[CS_rw←Rw](t) and LDI. (G) The relationship between the persistence of PR[CS_rw←Rw] signals and the number of trials to reversal criterion. *P < 0.05; **P < 0.01; ***P < 0.001. Error bars and shading, SEM across animals.

**Fig. 7. Optogenetic modulation of outcome-period VIP neuronal activity impairs reversal learning.**
(A) A schematic for viral vector injection and histological confirmation of ChRmine expression in the mPFC. Scale bar, 500 μm. (B) Laser stimulation (633 nm, 20 Hz, 5-ms pulse; pink shading) was given during the outcome period in task 1. (C) A schematic for two reversals per session (task 1). Laser stimulation was applied only during one of two reversals. (D) The number of trials to reversal criterion with (red) or without (gray) laser stimulation. Gray lines, individual animal data. **P < 0.01, t test. (E) Changes in anticipatory lick frequency during reversal (60-trial moving average, 15-trial steps). The same format as in Fig. 2C. Error bars, SEM across animals (n = 5 mice).
