PLoS Comput Biol. 2021 Jun 10;17(6):e1009017.

doi: 10.1371/journal.pcbi.1009017. eCollection 2021 Jun.

The functional role of sequentially neuromodulated synaptic plasticity in behavioural learning

Grace Wan Yu Ang¹, Clara S Tang², Y Audrey Hay², Sara Zannone¹, Ole Paulsen², Claudia Clopath¹

Affiliations

¹ Department of Bioengineering, Imperial College London, South Kensington Campus, London, United Kingdom.
² Department of Physiology, Development and Neuroscience, Physiological Laboratory, Cambridge, United Kingdom.

PMID: 34111110
PMCID: PMC8192019
DOI: 10.1371/journal.pcbi.1009017

Grace Wan Yu Ang et al. PLoS Comput Biol. 2021.

Abstract

To survive, animals have to quickly modify their behaviour when the reward changes. The internal representations responsible for this are updated through synaptic weight changes, mediated by certain neuromodulators conveying feedback from the environment. In previous experiments, we discovered a form of hippocampal Spike-Timing-Dependent-Plasticity (STDP) that is sequentially modulated by acetylcholine and dopamine. Acetylcholine facilitates synaptic depression, while dopamine retroactively converts the depression into potentiation. When these experimental findings were implemented as a learning rule in a computational model, our simulations showed that cholinergic-facilitated depression is important for reversal learning. In the present study, we tested the model's prediction by optogenetically inactivating cholinergic neurons in mice during a hippocampus-dependent spatial learning task with changing rewards. We found that reversal learning, but not initial place learning, was impaired, verifying our computational prediction that acetylcholine-modulated plasticity promotes the unlearning of old reward locations. Further, differences in neuromodulator concentrations in the model captured mouse-by-mouse performance variability in the optogenetic experiments. Our line of work sheds light on how neuromodulators enable the learning of new contingencies.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Inactivating cholinergic neurons affects reversal learning.**
(A) Schematic of the task paradigm. Mice were first trained to locate a baited food well in an open field (initial learning stage). After 8 days of training, the location of the baited well was shifted to the opposite quadrant, and training proceeded for another 12 days (reversal learning stage). Mice received 10 trials each day. (B) Learning performance across days, averaged over the mice in each group (GFP control, n = 8; Light-off control, n = 16; Light-on, n = 21). Error bars show SEM. GFP mice were tested in a separate cohort together with 5 light-on (ACh-suppressed) mice, and under-performed slightly in the initial learning stage. However, their performance was similar to that of light-off controls in the reversal learning stage, which indicated that they had successfully acquired the task. (C) Number of days taken for mice to reach and maintain an 80% success rate.
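
The criterion in panel C can be illustrated with a small sketch. It assumes that "reach and maintain" means the first day from which the daily success rate stays at or above 80% until the end of the stage; the caption does not spell out the exact rule, and the function name `days_to_criterion` is hypothetical.

```python
from typing import Optional, Sequence


def days_to_criterion(daily_success: Sequence[float],
                      threshold: float = 0.8) -> Optional[int]:
    """Return the 1-indexed first day from which the success rate stays
    at or above `threshold` through the last day, or None if it never does.

    Assumption (not stated in the caption): a single later day below
    the threshold resets the count.
    """
    first_day = None
    for day, rate in enumerate(daily_success, start=1):
        if rate >= threshold:
            if first_day is None:
                first_day = day  # start of a candidate run
        else:
            first_day = None  # run broken, criterion not maintained
    return first_day
```

With 10 trials per day, each entry of `daily_success` would be the fraction of those trials on which the mouse located the baited well.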

**Fig 2. Reducing acetylcholine in the sn-Plast model qualitatively accounts for the behavioural data.**
(A) The sn-Plast learning rule governing synaptic weight changes in the model. STDP changes synaptic weights (W) as a function of the time difference between pre- and postsynaptic spikes (Δt). The STDP windows are symmetric, and the sign of the weight change is determined by the neuromodulator. Acetylcholine is present during exploration, biasing STDP towards synaptic depression. When a reward is encountered, a phasic dopaminergic signal is released, which potentiates active synapses through an eligibility trace. The model consists of a one-layer network of place cells, representing the agent’s position, projecting to a ring network of recurrently connected action neurons coding for the direction taken by the agent. Connections between action neurons with similar tuning are excitatory (blue), and all others are inhibitory (red). The weights between place cells and action cells are modified according to the sn-Plast learning rule. (B) Learning rate parameters, which control the STDP window amplitude. (C) Reducing acetylcholine in the model (η_ACh = 0.000345 to η_ACh = 0.000184, at η_DA = 0.00115) impairs reversal learning, reproducing learning curves similar to the group performance of control and light-on mice shown in Fig 1B. (D) Policy preference map at different stages of the task for the parameters used in C. The blue filled circle indicates the location of the reward in the open maze. Vector fields (obtained by averaging the synaptic weights from each place cell to the action neurons) represent the agent’s policy preference map across days. The effect of reducing acetylcholine in the model is evident during the early phase of reversal learning (days 4 and 8 shown); reducing acetylcholine slows unlearning of the old reward location.
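
The rule in panel A can be sketched for a single synapse as follows. This is a toy illustration, not the authors' implementation: the exponential window shape, the 20 ms time constant, the per-step trace decay and all function names are assumptions, and only the two learning rates are taken from the caption of panel C.

```python
import math

ETA_ACH = 0.000345     # control acetylcholine learning rate (Fig 2C)
ETA_DA = 0.00115       # dopamine learning rate (Fig 2C)
TAU_WINDOW_MS = 20.0   # assumed width of the STDP window
TRACE_DECAY = 0.9      # assumed per-step decay of the eligibility trace


def stdp_window(dt_ms, tau_ms=TAU_WINDOW_MS):
    """Symmetric window: depends only on |dt|, not on spike order."""
    return math.exp(-abs(dt_ms) / tau_ms)


def simulate_synapse(pair_dts, reward_steps, w0=0.5,
                     eta_ach=ETA_ACH, eta_da=ETA_DA):
    """Run one synapse through a sequence of time steps.

    pair_dts: per-step pre/post spike time difference in ms, or None
              when no spike pairing occurs on that step.
    reward_steps: set of step indices at which dopamine is released.
    """
    w, trace = w0, 0.0
    for step, dt in enumerate(pair_dts):
        trace *= TRACE_DECAY                 # old pairings fade
        if dt is not None:
            coincidence = stdp_window(dt)
            w -= eta_ach * coincidence       # ACh: depression during exploration
            trace += coincidence             # remember the pairing
        if step in reward_steps:
            w += eta_da * trace              # DA: potentiate recently active synapses
    return w
```

In this sketch a pairing followed shortly by reward ends with a net weight increase because η_DA exceeds η_ACh, which mirrors the retroactive conversion of depression into potentiation; lowering `eta_ach` weakens the depression of unrewarded pairings, the effect the caption associates with slower unlearning.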

**Fig 3. Heterogeneity in learning across mice.**
(A) Mixed-effects logistic regression fit to the experimental data. Five regressors were used to predict the probability of a mouse locating the reward on each trial: group type (GFP, light-off or light-on), stage (initial learning or reversal learning), trial number, and interactions between the variables. Unique slopes and intercepts were estimated for all mice, which produced the individual predictions shown in B. Asterisks indicate significant terms. (B) Estimated probability of individual mice locating the correct well on each day, after fitting the mixed-effects logistic regression to the data. (C) Types of learning behaviours in mice. Some mice were slower in the initial learning stage (≥ 6 days to attain 80%) than they were at reversal learning. Others were slower in the reversal learning stage (≥ 8 days to attain 80%) than they were at initial learning. There were also mice that performed consistently well (fast learning and reversal) or poorly (slow learning and reversal) across the two task stages. (D) Examples of model fits to individual mice. The sn-Plast model was fit to each mouse by comparing the RMSE of the percentage of correct trials across days between the mouse (filled circles) and the agent. This was repeated for each iteration of the model, producing 100 parameter estimates for each mouse. Simulated behavioural data across the 100 best-fit estimates were then averaged to yield the performance curve (overlaid line). Error bars represent SEM. (inset) Final parameter estimate (x-coordinate, η_DA; y-coordinate, η_ACh). Colours of the subject labels indicate the type of learning behaviour as described in C.
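
The RMSE comparison in panel D can be sketched as a search over candidate parameter pairs. This is a minimal illustration assuming the simulated learning curves are already available; the search procedure actually used is not described in the caption, and `rmse` and `best_fit` are hypothetical helper names.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between two equal-length daily curves
    (percentage of correct trials per day)."""
    if len(observed) != len(simulated):
        raise ValueError("curves must cover the same number of days")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def best_fit(mouse_curve, candidates):
    """Pick the (eta_da, eta_ach) pair whose simulated curve is closest
    to the mouse's curve.

    candidates: dict mapping (eta_da, eta_ach) -> simulated daily curve.
    """
    return min(candidates, key=lambda params: rmse(mouse_curve, candidates[params]))
```

Repeating such a selection once per model iteration would yield one parameter estimate per iteration, in line with the 100 estimates per mouse mentioned in the caption.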

Cited by

A Navigation Path Search and Optimization Method for Mobile Robots Based on the Rat Brain's Cognitive Mechanism.
Liao Y, Yu N, Yan J. Biomimetics (Basel). 2023 Sep 14;8(5):427. doi: 10.3390/biomimetics8050427. PMID: 37754178. Free PMC article.
[Spatial navigation method based on the entorhinal-hippocampal-prefrontal information transmission circuit of rat's brain].
Liao Y, Yu N. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Feb 25;41(1):80-89. doi: 10.7507/1001-5515.202303047. PMID: 38403607. Free PMC article. Chinese.
