Sequence learning, prediction, and replay in networks of spiking neurons

Younes Bouhadjar et al.

PLoS Comput Biol. 2022 Jun 21;18(6):e1010233. doi: 10.1371/journal.pcbi.1010233. eCollection 2022 Jun.

Abstract

Sequence learning, prediction, and replay have been proposed to constitute the universal computations performed by the neocortex. The Hierarchical Temporal Memory (HTM) algorithm realizes these forms of computation. It learns sequences in an unsupervised and continuous manner using local learning rules, permits a context-specific prediction of future sequence elements, and generates mismatch signals when predictions are not met. While the HTM algorithm accounts for a number of biological features such as topographic receptive fields, nonlinear dendritic processing, and sparse connectivity, it is based on abstract discrete-time neuron and synapse dynamics, as well as on plasticity mechanisms that can only partly be related to known biological mechanisms. Here, we devise a continuous-time implementation of the temporal-memory (TM) component of the HTM algorithm, based on a recurrent network of spiking neurons with biophysically interpretable variables and parameters. The model learns high-order sequences by means of a structural Hebbian synaptic plasticity mechanism supplemented with a rate-based homeostatic control. In combination with nonlinear dendritic input integration and local inhibitory feedback, this type of plasticity leads to the dynamic self-organization of narrow sequence-specific subnetworks. These subnetworks provide the substrate for a faithful propagation of sparse, synchronous activity and, thereby, for a robust, context-specific prediction of future sequence elements as well as for the autonomous replay of previously learned sequences. By strengthening the link to biology, our implementation facilitates the evaluation of the TM hypothesis based on experimentally accessible quantities. The continuous-time implementation of the TM algorithm permits, in particular, an investigation of the role of sequence timing for sequence learning, prediction, and replay. We demonstrate this aspect by studying the effect of sequence speed on the learning performance and on the speed of autonomous sequence replay.


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Sketch of the task and the learning protocol.
A) The neuronal network model developed in this study learns and processes sequences of ordered discrete elements, here represented by characters “A”, “B”, “C”, …. Sequence elements may constitute arbitrary discrete items, such as musical notes, numbers, or images. The order of sequence elements represents the temporal order of item occurrence. B) After repeated, consistent presentation of sets of high-order sequences, i.e., sequences with overlapping characters (here, {A, D, B, E} and {F, D, B, C}), the model learns to predict subsequent elements in response to the presentation of other elements (blue arrows) and to detect unanticipated elements by generating a mismatch signal if the prediction is not met (red arrows and flash symbols). The learning process is continuous and unsupervised. At the beginning of the learning process, all presented elements are unanticipated and hence trigger the generation of a mismatch signal. The learning progress is monitored and quantified by the prediction error (see Task performance measures).
Fig 2. Sketch of the network structure.
A) The architecture constitutes a recurrent network of excitatory and inhibitory neurons. Excitatory neurons are stimulated by external sources providing sequence-element specific inputs “A”,“D”, etc. The excitatory neuron population is composed of subpopulations containing neurons with identical stimulus preference (gray circles). Connections between and within the excitatory subpopulations are random and sparse. Inhibitory neurons are mutually unconnected. Each neuron in the inhibitory population is recurrently connected to a specific subpopulation of excitatory neurons. B) Initial connectivity matrix for excitatory connections to excitatory neurons (EE connections). Target and source neurons are grouped into stimulus-specific subpopulations (“A”,…,“F”). Before learning, the excitatory neurons are sparsely and randomly connected via immature synapses (light gray dots). C) During learning, sequence specific, sparsely connected subnetworks with mature synapses are formed (light blue arrows: {A, D, B, E}, dark blue arrows: {F, D, B, C}). D) EE connectivity matrix after learning. During the learning process, subsets of connections between subpopulations corresponding to subsequent sequence elements become mature and effective (light and dark blue dots). Mature connections are context specific (see distinct connectivity between subpopulations “D” and “B” corresponding to different sequences), thereby providing the backbone for a reliable propagation of sequence-specific activity. In panels B and D, only 5% of sequence non-specific EE connections are shown for clarity. Dark gray dots in panel D correspond to mature connections between neurons that remain silent after learning. For details on the network structure, see Tables 1 and 2.
Fig 3. Effect of dendritic action potentials (dAP) on the firing response to an external stimulus.
Membrane-potential responses to an external input (blue arrow, A), a strong dendritic input (brown arrow, B) triggering a dAP, and a combination of both (C). Black and gray vertical bars mark times of excitatory and inhibitory spikes, respectively. The horizontal dashed line marks the spike threshold θE. The horizontal light blue lines depict the dAP plateau. D) Magnified view of spike times from panels A and C. A dAP preceding the external input (as in panel C) can speed up somatic, and hence, inhibitory firing, provided the time interval between the dAP and the external input is in the right range. The excitatory neuron is connected bidirectionally to an inhibitory neuron (see sketch on the right).
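The timing effect described in panel D can be illustrated with a minimal leaky integrate-and-fire sketch: a dAP plateau current that precedes a brief external input pre-depolarizes the soma, so the threshold is crossed earlier. All parameters below (membrane time constant, threshold, current amplitudes, plateau duration) are illustrative assumptions, not the model's actual values.

```python
# Minimal sketch (assumed parameters): a dAP plateau current starting before
# the external input advances the somatic spike evoked by that input.

def first_spike_time(dap_onset=None, t_ext=30.0, theta=15.0,
                     tau_m=10.0, dt=0.1, t_max=60.0):
    """Euler-integrated leaky membrane driven by a brief external input at
    t_ext; an optional dAP plateau current starts at dap_onset. Returns the
    time of the first threshold crossing, or None if none occurs."""
    v, t = 0.0, 0.0
    while t < t_max:
        i = 0.0
        if dap_onset is not None and dap_onset <= t < dap_onset + 50.0:
            i += 1.0                      # dAP plateau current (assumed)
        if t_ext <= t < t_ext + 2.0:
            i += 10.0                     # strong, brief external input
        v += dt * (-v / tau_m + i)        # leaky integration
        if v >= theta:
            return t
        t += dt
    return None

t_no_dap = first_spike_time()
t_with_dap = first_spike_time(dap_onset=10.0)
assert t_with_dap < t_no_dap  # the dAP advances the somatic spike
```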
Fig 4. Homeostatic regulation of the spike-timing-dependent structural plasticity by the dAP activity.
Evolution of the synaptic permanence (gray) and weight (black) during repetitive presynaptic-postsynaptic spike pairing for different levels of the dAP activity. In the depicted example, presynaptic spikes precede the postsynaptic spikes by 40 ms for each spike pairing. Consecutive spike pairs are separated by a 200 ms interval. In each panel, the postsynaptic dAP trace is clamped at a different value: z = 0 (left), z = 1 (middle), z = 2 (right). The dAP target activity is fixed at z* = 1. The horizontal dashed and dotted lines mark the maximum permanence Pmax and the maturity threshold θP, respectively.
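The permanence dynamics in this figure can be sketched as follows. The permanence P, maturity threshold θP, maximum permanence Pmax, and the dAP trace z with target z* follow the caption; the exact functional form of the homeostatic scaling of the Hebbian update is an assumption for illustration.

```python
# Hedged sketch of the structural plasticity with dAP-rate homeostasis.
# Variable names follow Fig 4; the update rule itself is an assumed form.

def permanence_update(P, dt_pre_post, z, z_star=1.0, lam=1.0, P_max=20.0):
    """One spike pairing: a causal pre-before-post pairing (dt_pre_post > 0)
    potentiates the permanence, scaled by a homeostatic factor that vanishes
    when the dAP trace z exceeds its target z_star; acausal pairings depress."""
    if dt_pre_post > 0:
        dP = lam * (z_star - z + 1.0)  # assumed homeostatic scaling:
    else:                              # z=0 -> fast growth, z=1 -> nominal,
        dP = -lam                      # z=2 -> growth stalls
    return min(max(P + dP, 0.0), P_max)

def synaptic_weight(P, theta_P=10.0, w_mature=1.0):
    """A synapse becomes effective ('mature') only once its permanence
    crosses the maturity threshold theta_P."""
    return w_mature if P >= theta_P else 0.0

# 40 ms pre-before-post pairing, as in the caption, at three clamped dAP levels
for z in (0.0, 1.0, 2.0):
    print(z, permanence_update(5.0, 40.0, z))
```

The weight thus changes in a step-like fashion when the (continuously evolving) permanence crosses θP, matching the black versus gray traces in the figure.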
Fig 5. Existence of divergent-convergent connectivity motifs in a random network.
A) Sketch of the divergent-convergent potential connectivity motif required for the formation of sequence specific subnetworks during learning. See main text for details. B) Dependence of the motif probability u on the connection probability p for nE = 150, c = 5, and ρ = 20 (see Table 2). The dotted vertical line marks the potential connection probability p = 0.2 used in this study.
Fig 6. Context specific predictions.
Sketches (left column) and raster plots of network activity (right column) before (top row) and after learning of the two sequences {A, D, B, E} and {F, D, B, C} (middle and bottom rows). In the left column, large light gray circles depict the excitatory subpopulations (same arrangement as in Fig 2). Red, blue and gray circles mark active, predictive and silent neurons, respectively. In the right column, red dots and blue lines mark somatic spikes and dAP plateaus, respectively. Type and timing of presented stimuli are depicted by black arrows. A,B) Snapshots of network activity upon subsequent presentation of the sequence elements “A” and “D” (panel A), and network activity in response to presentation of the entire sequence {A, D, B, E} (panel B) before learning. All neurons in the stimulated subpopulations become active. C,D) Same as panels A and B, but after learning. Presenting the first element “A” causes all neurons in the corresponding subpopulation to fire. Activation of these neurons triggers dAPs (predictions) in a subset of neurons representing the subsequent element “D”. When the next element “D” is presented, only these predictive neurons become active, leading to predictions in the subpopulation representing the subsequent element (“B”), etc. E,F) Same as panels C and D, but for sequence {F, D, B, C}. The subsets of neurons representing “D” and “B” activated during sequences {A, D, B, E} and {F, D, B, C} are distinct, i.e., context specific. For clarity, panels B, D, and F show only a fraction of excitatory neurons (30%).
Fig 7. dAP-rate homeostasis enhances context specificity.
A) Sketch of subpopulations of excitatory neurons representing the elements of the two sequences {F, D, B} and {A, D, B}, depicted by light and dark blue colors, respectively. Before learning, the connections between the subpopulations are immature (gray lines). Hence, for each element presentation, all neurons in the respective subpopulations fire (filled circles). B) Hebbian plasticity drives the formation of mature connections between subpopulations representing successive sequence elements (colored lines), and leads to sparse firing. The sets of neurons contributing to the two sequences partly overlap. C) Incorporating dAP-rate homeostasis reduces this overlap in the activation patterns.
Fig 8. Sequence prediction performance for sequence set I.
Dependence of the sequence prediction error (A), the false-positive and false-negative rates (B), and the number of active neurons relative to the subpopulation size (C) on the number of training episodes during repetitive stimulation with sequence set I (see Task and training protocol). Curves and error bands indicate the median as well as the 5% and 95% percentiles across an ensemble of 5 different network realizations, respectively. All prediction performance measures are calculated as a moving average over the last 4 training episodes. The dashed gray horizontal line in panel C depicts the target sparsity level ρ/(LnE). Inter-stimulus interval ΔT = 40 ms. See Table 2 for remaining parameters.
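The smoothing used for the performance curves (a moving average over the last 4 training episodes, truncated at the start of training) can be written as:

```python
def moving_average(errors, window=4):
    """Per-episode prediction error smoothed over the last `window`
    episodes; early episodes average over whatever history exists."""
    out = []
    for i in range(len(errors)):
        lo = max(0, i - window + 1)
        out.append(sum(errors[lo:i + 1]) / (i + 1 - lo))
    return out

# e.g. an error trace that drops from 1 (all predictions missed) to 0
print(moving_average([1.0, 1.0, 0.5, 0.0, 0.0]))
```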
Fig 9. Sequence prediction performance for sequence set II and comparison with original model.
Same figure arrangement, training and measurement protocol as in Fig 8. Data obtained during repetitive stimulation of the network with sequence set II (see Task and training protocol). Gray curves depict results obtained using the original (non-spiking) TM model from [14] with adapted parameters (see S1 Table). The dashed gray horizontal line in panel C depicts the target sparsity level ρ/(LnE).
Fig 10. Effect of sequence speed on network performance.
Dependence of the sequence prediction error, the learning speed (episodes-to-solution; A), the false-positive and false-negative rates (B), and the number of active neurons relative to the subpopulation size (C) on the inter-stimulus interval ΔT after 100 training episodes. Curves and error bands indicate the median as well as the 5% and 95% percentiles across an ensemble of 5 different network realizations, respectively. Same task and network as in Fig 8.
Fig 11. Sequence replay dynamics and speed.
Autonomous replay of the sequences {A, D, B, E} (A) and {F, D, B, C} (B), initiated by stimulating the subpopulations “A” and “F”, respectively. Red dots and blue lines mark somatic spikes and dAP plateaus, respectively, for a fraction of neurons (30%) within each subpopulation. During learning, the inter-stimulus interval ΔT is set to 40 ms. C) Dependence of the sequence replay duration on the inter-stimulus interval ΔT during learning. Replay duration is measured as the difference between the mean firing times of the populations representing the first and last elements in a given sequence. Gray areas mark regions with low prediction performance (see Dependence of prediction performance on the sequence speed). Error bands represent the mean ± standard deviation of the prediction error across 5 different network realizations. Same network and training set as in Fig 8.
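The replay-duration measure in panel C reduces to a difference of mean firing times; a minimal sketch with hypothetical spike times:

```python
def replay_duration(first_spike_times, last_spike_times):
    """Replay duration: difference between the mean somatic firing times of
    the subpopulations representing the last and first sequence elements."""
    mean = lambda ts: sum(ts) / len(ts)
    return mean(last_spike_times) - mean(first_spike_times)

# Hypothetical example: the first element's subpopulation fires around
# t = 10 ms, the last element's around t = 130 ms.
print(replay_duration([9.0, 10.0, 11.0], [129.0, 130.0, 131.0]))  # -> 120.0
```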

References

    1. Lashley KS. The problem of serial order in behavior. vol. 21. Bobbs-Merrill; 1951.
    2. Hawkins J, Blakeslee S. On intelligence: How a new understanding of the brain will lead to the creation of truly intelligent machines. Macmillan; 2007.
    3. Dehaene S, Meyniel F, Wacongne C, Wang L, Pallier C. The neural representation of sequences: from transition probabilities to algebraic patterns and linguistic trees. Neuron. 2015;88(1):2–19. doi: 10.1016/j.neuron.2015.09.019
    4. Clegg BA, DiGirolamo GJ, Keele SW. Sequence learning. Trends Cogn Sci. 1998;2(8):275–281. doi: 10.1016/S1364-6613(98)01202-9
    5. Gavornik JP, Bear MF. Learned spatiotemporal sequence recognition and prediction in primary visual cortex. Nat Neurosci. 2014;17(5):732. doi: 10.1038/nn.3683
