Front Integr Neurosci. 2023 Jun 15;17:935177.
doi: 10.3389/fnint.2023.935177. eCollection 2023.

Toward reproducible models of sequence learning: replication and analysis of a modular spiking network with reward-based learning

Barna Zajzon et al. Front Integr Neurosci. 2023.

Abstract

To acquire statistical regularities from the world, the brain must reliably process, and learn from, spatio-temporally structured information. Although an increasing number of computational models have attempted to explain how such sequence learning may be implemented in the neural hardware, many remain limited in functionality or lack biophysical plausibility. If we are to harvest the knowledge within these models and arrive at a deeper mechanistic understanding of sequential processing in cortical circuits, it is critical that the models and their findings are accessible, reproducible, and quantitatively comparable. Here we illustrate the importance of these aspects by providing a thorough investigation of a recently proposed sequence learning model. We re-implement the modular columnar architecture and reward-based learning rule in the open-source NEST simulator, and successfully replicate the main findings of the original study. Building on these, we perform an in-depth analysis of the model's robustness to parameter settings and underlying assumptions, highlighting its strengths and weaknesses. We demonstrate a limitation of the model, namely the hard-wiring of the sequence order in the connectivity patterns, and suggest possible solutions. Finally, we show that the core functionality of the model is retained under more biologically plausible constraints.

Keywords: modularity; reproducibility; reward-based learning; sequence learning model; spiking networks.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Sequence learning task and network architecture. (A) A sequence of three intervals (elements) is learned by a network with as many dedicated populations (columns). The individual populations are stimulated sequentially, with a global reward signal given at the beginning and the end of each element. After training, the recurrent and feedforward weights are strengthened, and the sequence is successfully recalled following a cue. The fullness of the colored sections on the right illustrates the duration of the activity (firing rates) above a certain threshold. (B) Each stimulus-specific column is composed of two excitatory populations, Timers (T) and Messengers (M), and two corresponding inhibitory populations, IT and IM. Solid (dashed) arrows represent fixed static (plastic) connections. Cross-columnar inhibition always targets the excitatory population in the corresponding layer (L5 or L2/3). (C) Firing rates of the excitatory populations during learning (top three plots) and recall (bottom plot) of four time intervals (500; 1,000; 700; and 1,800 ms). Light (dark) colors represent T (M) cells. Dashed light blue curve in top panel inset shows the inhibitory population IT in L5. Green (gray) vertical bars show the 25 ms reward (trace refractory) period, 25 ms after stimulus offset (see inset). (D) Spiking activity of excitatory cells (top) and corresponding ISI distributions (bottom), during recall, for the network in (C). In the raster plot, neurons are sorted by population (T, M) and sequentially by column (see color coding on the right).
Figure 2
Accuracy of recall and evolution of learning. Results shown for a sequence of four intervals of 700 ms. (A) Fluctuations in learning and sequence recall. We define recall time as the time at which the rate of the Timer population drops below 10 spks/sec. Left: recall times for 30 trials after learning, for one network instance. Right: distribution of the median recall times over 10 network instances, with the median in each network calculated over 30 replay trials. (B) Mean synaptic weights for feedforward (Messenger to Timer in subsequent columns, top) and recurrent (Timer to Timer in the same column, bottom) connections for one network instance. (C) Mean LTP and LTD traces for the recurrent (top) and feedforward (bottom) connections, for learning trials T = 3, T = 15, and T = 35 and one network instance.
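The recall-time criterion used in this figure (the first time the Timer population's rate drops below 10 spks/sec) can be sketched as follows; the rate trace and sampling step here are hypothetical stand-ins for illustration, not the study's actual data:

```python
import numpy as np

def recall_time(rates, dt, threshold=10.0):
    """Time (in ms) at which a rate trace first drops below `threshold`
    (10 spks/sec in the figure). `rates` is sampled every `dt` ms.
    Returns None if the rate never falls below threshold."""
    below = np.flatnonzero(np.asarray(rates) < threshold)
    return float(below[0]) * dt if below.size else None

# hypothetical Timer trace: ~40 spks/sec for 700 ms, then exponential decay
t = np.arange(0, 1000, 1.0)
rates = np.where(t < 700, 40.0, 40.0 * np.exp(-(t - 700) / 50.0))
rt = recall_time(rates, dt=1.0)
```

For a well-trained network, `rt` would cluster around the trained interval duration; the spread of `rt` over trials corresponds to the fluctuations shown in panel (A).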
Figure 3
Robustness to variation in synaptic weights and learning parameters. The system was trained on a sequence of four elements, each with a duration of 700 ms. For the Timer cells, we define relative recall time as the recall time relative to stimulation onset, i.e., the time from the expected onset time (0; 700; 1,400; 2,100 ms) in the sequence until the rate drops below a threshold of 10 spks/sec. Conversely, absolute recall time is simply the time when the rate drops below threshold (relative to 0). (A) Number of outlier intervals reported during 50 recall trials, as a function of the percentage change of two synaptic weights within a column: excitatory Timer to Messenger, and inhibitory IT to Messenger. Top row shows the number of outliers, defined as a deviation of ±140 ms from the correct interval relative to expected onset (left), and the number of outliers detected using a modified z-score (threshold >3, right panel) based on the median absolute deviation in column C4 (see main text). Bottom row shows the respective outliers averaged over all four columns. (B) Deviation of the median recall time from the expected 700 ms, as a function of the excitatory and inhibitory synaptic weights onto the Messenger cells in a column (left), and as a function of the cross-columnar (Ci→Cj) inhibitory synaptic weights within the same layers (right). Top and bottom row as in (A). All data in (A, B) is averaged over 20 network instances. (C) Mean recall time of a four-element sequence of 700 ms intervals, over 50 recall trials of a single network instance. Left: baseline network. Center: during each training trial, the learning parameters (see main text) are drawn randomly and independently from a distribution of ±20% around their baseline value. Error bars represent the standard deviation. Right: the set of learning parameters is drawn randomly once for each network instance, with data shown averaged over 10 instances.
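The MAD-based outlier criterion mentioned in panel (A) follows the standard modified z-score construction; the example intervals below are hypothetical, chosen only to show the mechanics:

```python
import numpy as np

def modified_z_scores(intervals):
    """Modified z-score based on the median absolute deviation (MAD).
    The figure flags an interval as an outlier when |score| > 3."""
    x = np.asarray(intervals, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    # 0.6745 is the usual consistency constant (scales MAD to the
    # standard deviation for normally distributed data)
    return 0.6745 * (x - med) / mad

# hypothetical recall intervals (ms): four near the 700 ms target,
# one badly off
intervals = [700.0, 705.0, 695.0, 710.0, 1200.0]
outliers = np.abs(modified_z_scores(intervals)) > 3
```

Unlike a mean/standard-deviation z-score, the median-based version is not dragged toward the outlier itself, which is why it is preferred for small trial counts.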
Figure 4
Activity of L5 inhibitory population is critical for accurate learning. (A) Deviation of the median recall time of three intervals of 700 ms, as a function of the change in the T→IT synaptic weights relative to baseline (Δw = 0). Gray area (<−25%) marks the region where learning is unstable (not all elements can be recalled robustly). Data is averaged over five network instances. (B–D) Characteristic firing rates during recall for weight deviations of −25, 0, and 40% relative to baseline. Solid curves represent the excitatory populations as in Figure 1, while dashed curves indicate the respective inhibitory populations IT in Ci.
Figure 5
Scaling the model requires manual retuning of parameters. (A) Characteristic firing rates during training (top) and recall (bottom) of a sequence composed of three 700 ms intervals, in a larger network where each population is composed of N′ = 400 cells. All static weights have been scaled down in proportion to the increase in population size, by a factor of N/N′ (see Methods). Solid curves show Timer (light) and Messenger (dark) cells, dashed curves IT cells. (B) As in (A), with further manual tuning of specific weights. For details, see Section 4 and Supplementary material.
Figure 6
All-to-all cross-columnar excitation prohibits learning. (A) Extending the original architecture described in Figure 1B, M→T connections exist between all columns Ci→Cj (i ≠ j) and are subject to the same plasticity. (B) Firing rates of the excitatory populations during learning and recall of four time intervals (each 700 ms). Initially, learning evolves as in Figure 1C, but the activity becomes degenerate and the sequence cannot be recalled correctly (lower panels). (C) Evolution of the cross-columnar (from C2, top panel) and recurrent Timer synaptic weights (bottom panel). The transition to the next sequence element cannot be uniquely encoded, as the weights to all columns are strengthened. (D) Sequence recall after 100 training trials in a network with low background noise (50% of the baseline value, σξ/2). (E) Sequence recall after 100 training trials in a network with a higher Hebbian activation threshold for the cross-columnar projections, r_th^ff = 30 spks/sec (instead of the baseline 20 spks/sec).
Figure 7
Alternative wiring with local inhibition and only excitatory cross-columnar projections. (A) Architecture with local inhibition, functionally equivalent to Figure 1B. Inhibitory projections are now local to the column, and feedforward inhibition is achieved via cross-columnar excitatory projections onto the I populations. (B) Recall of a sequence composed of two 700 ms intervals. Inset (bottom panel) zooms in on the activity at lower rates. As before, color codes for columns. Color shade represents populations in L5 (light) and L2/3 (dark), with solid curves denoting excitatory populations. Dashed (dotted) curves represent the inhibitory cells IT (IM).

