Sci Rep. 2022 Jul 30;12(1):13107.
doi: 10.1038/s41598-022-16407-9.

Stochastic consolidation of lifelong memory

Nimrod Shaham et al. Sci Rep. 2022.

Abstract

Humans have the remarkable ability to continually store new memories, while maintaining old memories for a lifetime. How the brain avoids catastrophic forgetting of memories due to interference between encoded memories is an open problem in computational neuroscience. Here we present a model for continual learning in a recurrent neural network combining Hebbian learning, synaptic decay and a novel memory consolidation mechanism: memories undergo stochastic rehearsals with rates proportional to the memory's basin of attraction, causing self-amplified consolidation. This mechanism gives rise to memory lifetimes that extend much longer than the synaptic decay time, and retrieval probability of memories that gracefully decays with their age. The number of retrievable memories is proportional to a power of the number of neurons. Perturbations to the circuit model cause temporally-graded retrograde and anterograde deficits, mimicking observed memory impairments following neurological trauma.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
(a) Pure forgetting. A memory efficacy trajectory as a function of time (solid line). The critical efficacy A_c is plotted as a dashed line. (b) Overlap of the network state with a memory state as a function of the memory age. The overlap is a measure of memory retrievability: after the network is initialized near a memory state, the overlap of the resulting attractor activity is close to unity for retrievable memories and much smaller than one for irretrievable memories. Here N=8000, f=0.01, τ=2240. The catastrophic age here is 1.73τ, resulting in a capacity (number of retrievable memories) of ≈0.5N. Note the very large value of τ needed to support this capacity; this will be addressed in later sections.
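The capacity quoted in this caption can be checked with a line of arithmetic. The sketch below assumes that one new memory is stored per unit of time, so that the number of retrievable memories equals the catastrophic age. That storage rate is an assumption made here for illustration; the caption does not state it.

```python
# Parameters quoted in the Figure 1 caption.
N = 8000                        # number of neurons
tau = 2240                      # synaptic decay time
catastrophic_age_in_tau = 1.73  # age beyond which memories are irretrievable

# Assumption (not stated in the caption): one memory is stored per unit time,
# so the count of retrievable memories equals the catastrophic age.
catastrophic_age = catastrophic_age_in_tau * tau
capacity_per_neuron = catastrophic_age / N

print(round(catastrophic_age), round(capacity_per_neuron, 2))  # prints: 3875 0.48
```

Under that assumption the result, about 0.48N, agrees with the caption's ≈0.5N.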
Figure 2
Stochastic memory dynamics. (a) Blue: basin of attraction size F as a function of memory efficacy A. Orange dashed: A/τ, the negative of the deterministic decay term in Eq. (7). Importantly, F is zero for A<A_c. Here λτ=5, A_c=0.4. (b) Example memory efficacies vs. age of the system. Memories enter with efficacy A(0)=1; the rehearsal efficacy is b=0.3. Most of them increase towards A_fp≈bλτ=1.5 and fluctuate around it. Large enough fluctuations can take efficacies below A_c (e.g., cyan curve at age/τ≈40, yellow curve at age/τ≈120). Some memories are alive for a very short time (e.g., green curve) and some for very long (e.g., red, blue curves). (c) Distribution of memory efficacies after saturation of A_c. (d) Equilibrium values of A_c as a function of bλτ for different λτ values. Here τ=160, N=8000.
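The single-memory dynamics described in this caption (decay at rate A/τ, rehearsal jumps of size b arriving at a rate proportional to F(A), and forgetting once A falls below A_c) can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' code. The functional form of `basin_size`, its `width` parameter, the Euler time step `dt`, and the fixed value of A_c are hypothetical choices; the caption only states that F vanishes below A_c.

```python
import math
import random


def basin_size(a, a_c=0.4, width=0.3):
    """Hypothetical F(A): zero below a_c, rising smoothly towards 1 above it."""
    if a < a_c:
        return 0.0
    return 1.0 - math.exp(-(a - a_c) / width)


def simulate_memory(tau=160.0, lam_tau=5.0, b=0.3, a0=1.0, a_c=0.4,
                    max_age_in_tau=50.0, dt=0.1, seed=0):
    """Return (age at stop, final efficacy) for a single memory.

    The loop stops when the efficacy drops below a_c (memory forgotten)
    or when the maximum simulated age is reached.
    """
    rng = random.Random(seed)
    lam = lam_tau / tau            # rehearsal rate scale
    max_age = max_age_in_tau * tau
    a, age = a0, 0.0
    while age < max_age and a >= a_c:
        # Rehearsal: a jump of size b with probability lam * F(A) * dt.
        if rng.random() < lam * basin_size(a, a_c) * dt:
            a += b
        # Deterministic decay term -A/tau (Euler step).
        a -= (a / tau) * dt
        age += dt
    return age, a


# With rehearsals switched off (b = 0) the memory decays exponentially and is
# lost near tau * ln(A(0) / A_c).
age_no_rehearsal, _ = simulate_memory(b=0.0)
print(age_no_rehearsal, 160.0 * math.log(1.0 / 0.4))
```

Where F is close to 1, the decay term A/τ balances the mean rehearsal drive bλ at A = bλτ, which is the fixed point the caption refers to.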
Figure 3
(a, b) The forgetting curve. The probability of retrieval as a function of memory age. Blue dots: full network simulation results (see “Methods”). Red solid lines: results of a mean field approximation (“Methods”). An exponential fit with characteristic decay time 18τ is shown in green (dash-dot line) in (a), and a double exponential fit with characteristic decay times τ and 38τ in (b). The retrieval probability for pure forgetting is shown in black (dashed line). In (a) N=8000, τ=160, λτ=5, b=0.3. In (b) the parameters are the same as in (a) except b=0.25, λτ=10. (c) Blue (left y axis): Consolidation probability vs. bλτ for different λτ values. Green (right y axis): Consolidation time τ_c normalized by synaptic decay time τ vs. bλτ for different λτ values. Here τ=160. (d) Consolidation time τ_c normalized by synaptic decay time τ vs. τ for different λτ values. Blue curve: λτ=5, b=0.3. Green curve: λτ=10, b=0.25.
Figure 4
Memory capacity. (a) The number of retrievable memories divided by N as a function of bλτ for different values of λτ, the average number of rehearsals per characteristic decay time. The dashed line shows the capacity of the pure forgetting model. Here N=8000, τ=160. (b) The number of retrievable memories divided by N as a function of τ for different λτ values. (c) Capacity vs. N (logarithmic, base 10); solid lines show simulation results, dashed lines show the analytical approximation. Here τ=160, b=0.3. (d) The power of N vs. λτ (black), and the analytical approximation (red). Here τ=160, b=0.3.
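The "power of N" in panel (d) is the slope of capacity against N on log-log axes. A minimal sketch of how such an exponent could be estimated by least squares is shown below. The function name `fit_power_law_exponent` is hypothetical, and the data are synthetic values generated for this example, not results from the paper.

```python
import math


def fit_power_law_exponent(ns, capacities):
    """Least-squares slope of log10(capacity) against log10(N)."""
    xs = [math.log10(n) for n in ns]
    ys = [math.log10(c) for c in capacities]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, made-up data that follows capacity = 2 * N**0.7 exactly.
ns = [1000, 2000, 4000, 8000]
capacities = [2 * n ** 0.7 for n in ns]

exponent = fit_power_law_exponent(ns, capacities)
print(round(exponent, 3))  # recovers the exponent used to build the data: 0.7
```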
Figure 5
Initial condition distribution. (a) Here A(0) for each memory is drawn from an exponential distribution with unity mean. Blue points are values for single memories, and the red line shows the mean. Note that the spread in lifetimes at each encoding strength is a result of the stochastic rehearsal process, which yields an exponential distribution of lifetimes and is present also in the uniform A(0) case. (b) Consolidation probability vs. a_0, which is the initial efficacy of half of the memories (the others have A(0)=1). Values for memories introduced with A(0)=1 are shown in blue, and for memories introduced with A(0)=a_0 are shown in red. (c) Memory mean lifetime (time from insertion to forgetting) as a function of a_0. Same scenario and coloring as in (b). Dashed lines are averaged lifetimes of consolidated memories only. Parameters: N=8000, f=0.01, λτ=5, b=0.3.
Figure 6
Perturbations and memory deficits. (a) The ratio between the capacity with and without injected noise vs. the diffusion coefficient D. (b) Retrieval probability vs. memory age with noisy synaptic dynamics (D=6). Noise onset was before: 5τ (green), 10τ (blue), 20τ (purple), 40τ (brown). The control (black) is simulated with noiseless dynamics. (c) The ratio between the capacity with and without synaptic dilution vs. the silenced synapses fraction p. (d) Retrieval probability vs. memory age for random synaptic dilution (p=0.1). Coloring as in (b). (e) Same as (d), but with p=0.2. Memories of all ages are affected, with some non-monotonicity caused by the small efficacies of newly learned memories, which drop more easily below A_c. (f) Combination of synaptic dilution and noisy synaptic dynamics, D=6 and p=0.1. Coloring as in (b). Parameters: N=8000, τ=160, λτ=10, b=0.25.
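Random synaptic dilution, as used in panels (c) to (f), silences a fraction p of the synapses. One simple way to express this is sketched below. The helper name `dilute_synapses` is hypothetical, and silencing each directed synapse independently is an assumption; the caption does not say whether the paper dilutes symmetrically or how it samples the silenced set.

```python
import random


def dilute_synapses(weights, p, seed=0):
    """Return a copy of `weights` with each entry set to 0 with probability p.

    Assumption: every synapse is silenced independently of the others.
    """
    rng = random.Random(seed)
    return [[0.0 if rng.random() < p else w for w in row] for row in weights]


# Toy 100 x 100 weight matrix of ones, diluted with p = 0.1 as in panel (d).
weights = [[1.0] * 100 for _ in range(100)]
diluted = dilute_synapses(weights, p=0.1)

silenced_fraction = sum(row.count(0.0) for row in diluted) / (100 * 100)
print(silenced_fraction)  # close to p for a matrix of this size
```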
Figure 7
Effect of threshold adaptation. Blue bars show capacity with threshold optimized for the noiseless case (θ_0=0.36). Green bar shows capacity with threshold optimized for low dilution (p=0.1, θ_1=0.31). Red bar shows capacity for threshold optimized for high dilution (p=0.2, θ_2=0.29). Parameters: N=8000, τ=160, λτ=10, b=0.25.
Figure 8
Power-law synaptic decay characteristic time distribution. (a) Retrieval probability as a function of memory age on log-log scale (blue), with power-law fit in black (dashed line, slope≈−1). Here λτ=5 for the empirical mean τ, b=0.25, the power α=1.5. The minimal τ is 20. (b) Goodness of fit (R²) for the forgetting curve using an exponential function (orange, dash-dot) and power-law function (blue), as a function of the power parameter of the characteristic time distribution.
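Decay times with a power-law distribution and a minimal value, as described in this caption, can be drawn by inverse-transform sampling. The sketch below assumes that the density is proportional to τ^(−α) for τ ≥ τ_min, that is, α is read as the exponent of the density. That reading of "the power α" is an assumption, and the function name `sample_decay_times` is hypothetical.

```python
import random


def sample_decay_times(n, alpha=1.5, tau_min=20.0, seed=0):
    """Draw n decay times with density proportional to tau**(-alpha), tau >= tau_min.

    Inverse transform: the CDF is 1 - (tau / tau_min)**(-(alpha - 1)),
    so tau = tau_min * (1 - u)**(-1 / (alpha - 1)) for uniform u in [0, 1).
    Requires alpha > 1 so that the density can be normalized.
    """
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    return [tau_min * (1.0 - rng.random()) ** exponent for _ in range(n)]


taus = sample_decay_times(10001)
median_tau = sorted(taus)[len(taus) // 2]
print(min(taus) >= 20.0, median_tau)  # median is near tau_min * 0.5**(-2) = 80
```

Under this reading, α=1.5 gives a distribution whose theoretical mean diverges, so any mean computed from a finite sample is an empirical quantity.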
