PLoS Comput Biol. 2021 Jun 10;17(6):e1009017.
doi: 10.1371/journal.pcbi.1009017. eCollection 2021 Jun.

The functional role of sequentially neuromodulated synaptic plasticity in behavioural learning

Grace Wan Yu Ang et al. PLoS Comput Biol.

Abstract

To survive, animals have to quickly modify their behaviour when the reward changes. The internal representations responsible for this are updated through synaptic weight changes, mediated by certain neuromodulators conveying feedback from the environment. In previous experiments, we discovered a form of hippocampal Spike-Timing-Dependent-Plasticity (STDP) that is sequentially modulated by acetylcholine and dopamine. Acetylcholine facilitates synaptic depression, while dopamine retroactively converts the depression into potentiation. When these experimental findings were implemented as a learning rule in a computational model, our simulations showed that cholinergic-facilitated depression is important for reversal learning. In the present study, we tested the model's prediction by optogenetically inactivating cholinergic neurons in mice during a hippocampus-dependent spatial learning task with changing rewards. We found that reversal learning, but not initial place learning, was impaired, verifying our computational prediction that acetylcholine-modulated plasticity promotes the unlearning of old reward locations. Further, differences in neuromodulator concentrations in the model captured mouse-by-mouse performance variability in the optogenetic experiments. Our line of work sheds light on how neuromodulators enable the learning of new contingencies.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Inactivating cholinergic neurons affects reversal learning.
(A) Schematic of the task paradigm. Mice were first trained to locate a baited food well in an open field (initial learning stage). After 8 days of training, the baited well was moved to the opposite quadrant, and training continued for another 12 days (reversal learning stage). Mice received 10 trials each day. (B) Learning performance across days, averaged over the mice in each group (GFP control, n = 8; Light-off control, n = 16; Light-on, n = 21). Error bars show SEM. GFP mice were tested in a separate cohort alongside 5 light-on (ACh-suppressed) mice and slightly under-performed in the initial learning stage. However, their performance was similar to that of light-off controls in the reversal learning stage, indicating that they had successfully acquired the task. (C) Number of days taken for mice to reach and maintain an 80% success rate.
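The criterion in panel C can be illustrated with a short sketch. The caption does not define "reach and maintain" precisely, so the code below assumes it means the first day from which the daily success rate stays at or above 80% for the rest of the stage. The function name `days_to_criterion` and the input format (one success fraction per day) are hypothetical and not taken from the paper.

```python
from typing import Optional, Sequence


def days_to_criterion(daily_success: Sequence[float],
                      threshold: float = 0.8) -> Optional[int]:
    """Return the 1-based day on which performance reaches `threshold`
    and stays at or above it for every remaining day of the stage.

    Assumption: "reach and maintain" means no later day drops below
    the threshold. Returns None if the final day is below threshold.
    """
    day = None
    # Walk backwards from the last day while performance stays above threshold.
    for i in range(len(daily_success) - 1, -1, -1):
        if daily_success[i] >= threshold:
            day = i + 1
        else:
            break
    return day


# Illustrative input only (not data from the study): fraction of the
# 10 daily trials on which the baited well was found.
example = [0.2, 0.5, 0.8, 0.7, 0.9, 1.0]
print(days_to_criterion(example))
```

Under this reading, a mouse that touches 80% once and then dips below it is not counted as having met the criterion until its final sustained run begins.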
Fig 2. Reducing acetylcholine in the sn-Plast model qualitatively accounts for the behavioural data.
(A) The sn-Plast learning rule governing synaptic weight changes in the model. STDP changes synaptic weights (W) as a function of the time difference between pre- and postsynaptic spikes (Δt). The STDP windows are symmetric, and the sign of the weight change is determined by the neuromodulator. Acetylcholine is present during exploration, biasing STDP towards synaptic depression. When a reward is encountered, a phasic dopaminergic signal is released, which potentiates active synapses through an eligibility trace. The model consists of a one-layer network of place cells, representing the agent’s position, projecting to a ring network of recurrently connected action neurons coding for the direction taken by the agent. Connections between action neurons with similar tuning are excitatory (blue) and inhibitory otherwise (red). The weights between place cells and action cells are modified according to the sn-Plast learning rule. (B) Learning rate parameters that control the STDP window amplitude. (C) Reducing acetylcholine in the model (ηACh = 0.000345 to ηACh = 0.000184, at ηDA = 0.00115) impairs reversal learning, reproducing learning curves similar to the group performance of control and light-on mice shown in Fig 1B. (D) Policy preference map at different stages of the task for the parameters used in C. The blue filled circle indicates the location of the reward in the open maze. Vector fields, obtained by averaging the synaptic weights from each place cell to the action neurons, represent the agent’s policy preference map across days. The effect of reducing acetylcholine in the model is evident during the early phase of reversal learning (days 4 and 8 shown); reducing acetylcholine slows unlearning of the old reward location.
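The sequential rule described in panel A can be sketched for a single synapse. This is a minimal illustration under stated assumptions, not the authors' implementation: the exponential window shape, the two time constants, and the `Synapse` class are hypothetical. Only the learning rates ηACh = 0.000345 and ηDA = 0.00115 come from the caption.

```python
import math

ETA_ACH = 0.000345     # control acetylcholine learning rate (from the caption)
ETA_DA = 0.00115       # dopamine learning rate (from the caption)
TAU_STDP_MS = 10.0     # assumed width of the symmetric STDP window
TAU_ELIG_MS = 2000.0   # assumed decay time of the eligibility trace


def stdp_window(dt_ms: float) -> float:
    """Symmetric window: depends only on |dt| (exponential shape assumed)."""
    return math.exp(-abs(dt_ms) / TAU_STDP_MS)


class Synapse:
    """One place-cell-to-action-cell weight with an eligibility trace."""

    def __init__(self, weight: float = 1.0) -> None:
        self.weight = weight
        self.eligibility = 0.0

    def on_spike_pair(self, dt_ms: float, eta_ach: float = ETA_ACH) -> None:
        """During exploration, acetylcholine biases STDP towards depression;
        the same pairing is also stored in the eligibility trace."""
        window = stdp_window(dt_ms)
        self.weight -= eta_ach * window
        self.eligibility += window

    def decay(self, elapsed_ms: float) -> None:
        """Let the eligibility trace fade between pairing and reward."""
        self.eligibility *= math.exp(-elapsed_ms / TAU_ELIG_MS)

    def on_reward(self, eta_da: float = ETA_DA) -> None:
        """Phasic dopamine potentiates recently active synapses."""
        self.weight += eta_da * self.eligibility


syn = Synapse()
syn.on_spike_pair(5.0)    # pre-before-post and post-before-pre
syn.on_spike_pair(-5.0)   # produce the same depression
syn.on_reward()           # reward arrives: net change becomes potentiation
```

With these values, potentiation at reward outweighs the earlier depression because ηDA is larger than ηACh. A synapse on a path that never leads to reward only accumulates depression, which is one way to read the unlearning role the caption attributes to acetylcholine.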
Fig 3. Heterogeneity in learning across mice.
(A) Mixed-effects logistic regression fit to the experimental data. Five regressors were used to predict the probability of a mouse locating the reward on each trial: group type (GFP, light-off or light-on), stage (initial learning or reversal learning), trial number, and interactions between the variables. Unique slopes and intercepts were estimated for all mice, which produced the individual predictions shown in B. Asterisks indicate significant terms. (B) Estimated probability of individual mice locating the correct well on each day, after fitting the mixed-effects logistic regression to the data. (C) Types of learning behaviours in mice. Some mice were slower in the initial learning stage (≥ 6 days to attain 80%) than they were at reversal learning. Others were slower in the reversal learning stage (≥ 8 days to attain 80%) than they were at initial learning. There were also mice that performed consistently well (fast learning and reversal) or poorly (slow learning and reversal) across the two task stages. (D) Examples of model fits to individual mice. The sn-Plast model was fit to each mouse by comparing the RMSE of the percentage of correct trials across days between the mouse (filled circles) and the agent. This was repeated for each iteration of the model, producing 100 parameter estimates for each mouse. Simulated behavioural data across the 100 best-fit estimates were then averaged to yield the performance curve (overlaid line). Error bars represent SEM. (inset) Final parameter estimate (x-coordinate, ηDA; y-coordinate, ηACh). Colours of the subject labels indicate the type of learning behaviour as described in C.
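The RMSE comparison in panel D can be sketched as follows. The caption does not describe the search procedure, so this sketch simply assumes a set of candidate (ηDA, ηACh) pairs, each with an already simulated learning curve, and picks the pair with the lowest error. The helper names `rmse` and `best_fit` and all numeric values are hypothetical and chosen for illustration only.

```python
import math
from typing import Dict, Sequence, Tuple

Params = Tuple[float, float]  # (eta_da, eta_ach)


def rmse(mouse: Sequence[float], agent: Sequence[float]) -> float:
    """Root-mean-square error between two per-day performance curves."""
    if len(mouse) != len(agent) or not mouse:
        raise ValueError("curves must be non-empty and of equal length")
    squared = sum((m - a) ** 2 for m, a in zip(mouse, agent))
    return math.sqrt(squared / len(mouse))


def best_fit(mouse_curve: Sequence[float],
             candidates: Dict[Params, Sequence[float]]) -> Params:
    """Return the candidate parameter pair whose simulated curve has the
    lowest RMSE against the mouse's percentage of correct trials per day."""
    return min(candidates, key=lambda p: rmse(mouse_curve, candidates[p]))


# Toy values, not data from the study: percentage correct over four days.
mouse = [20.0, 50.0, 80.0, 90.0]
simulated = {
    (0.00115, 0.000345): [25.0, 55.0, 75.0, 90.0],
    (0.00115, 0.000184): [10.0, 30.0, 50.0, 70.0],
}
print(best_fit(mouse, simulated))
```

Repeating such a selection over independent model iterations would give one estimate per iteration, matching the caption's description of 100 estimates per mouse that are then averaged.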
