Proc Natl Acad Sci U S A. 2023 Feb 7;120(6):e2205211120. doi: 10.1073/pnas.2205211120. Epub 2023 Jan 31.

Distinct replay signatures for prospective decision-making and memory preservation


G Elliott Wimmer et al. Proc Natl Acad Sci U S A. 2023.

Abstract

Theories of neural replay propose that it supports a range of functions, most prominently planning and memory consolidation. Here, we test the hypothesis that distinct signatures of replay in the same task are related to model-based decision-making ("planning") and memory preservation. We designed a reward learning task wherein participants utilized structure knowledge for model-based evaluation while at the same time maintaining knowledge of two independent and randomly alternating task environments. Using magnetoencephalography and multivariate analysis, we first identified temporally compressed sequential reactivation, or replay, both prior to choice and following reward feedback. Before choice, prospective replay strength was enhanced for the current task-relevant environment when a model-based planning strategy was beneficial. Following reward receipt, and consistent with a memory preservation role, replay for the alternative distal task environment was enhanced as a function of decreasing recency of experience with that environment. Critically, these planning and memory preservation relationships were selective to the pre-choice and post-feedback periods, respectively. Our results provide support for key theoretical proposals regarding the functional role of replay and demonstrate that the relative strength of planning- and memory-related signals is modulated by ongoing computational and task demands.

Keywords: decision-making; hippocampus; memory; planning; replay.


Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
Two-environment reward learning task and generalization behavior. (A) Task schematic showing two alternative worlds and their two equivalent start states. Trials in each world start at one of two equivalent shape pair options; to illustrate these connections, arrows from different states differ in color saturation. The shape options then lead deterministically to the same paths and reward outcomes (0 to 9 points (pts)). To learn this general structure, participants engaged in a training session before scanning. For the MEG scanning session, participants then learned two worlds populated with new images. Participants’ memory for the path sequences was at asymptote after an initial no-reward exploration period. (B) Key trial periods in the reward learning phase. Replay was measured prior to choice in a time window we subsequently refer to as the “planning period.” After the disappearance of a central cross, participants entered their response. Participants then sequentially viewed the state images corresponding to the chosen path. Finally, participants received reward feedback (0 to 9 points), the amount of which drifted across trials. Replay was again measured in the post-feedback time window. For interpretation of subsequent results, in this example, World 1 is the “current world,” while World 2 is the non-presented “other world.” (SI Appendix, Fig. S1.) (C) Example trial sequence, highlighting two cases where a trial either has a different start state or the same start state as the previous trial in the same world. (D) Illustration of the dependence of repeated choices (stay) on previous rewards, conditional on whether the start state in the current world was the same as in the previous trial or not. The plot depicts the probability of a stay choice (when participants repeated a previous path selection in a given world) following above-average (high) versus below-average (low) reward. This difference was equivalent for same (purple) versus different (orange) starting states, consistent with behavior being model-based (“n.s.” represents nonsignificant effects in a regression model using continuous reward data). For display purposes, graded point feedback was binarized into high and low and trials with near-mean feedback were excluded; alternative procedures yield the same qualitative results. Gray dots represent individual participants. (SI Appendix, Fig. S2.)
Fig. 2.
Training of state localizer and sequenceness time lag identification. See also SI Appendix, Fig. S3. (A) Classifier performance for path state stimuli presented during a pre-task localizer phase, training and testing at all time points. This revealed good discrimination between the 12 path stimuli used in the learning task. The color bar indicates predicted probability. Note that start state shape stimuli were not included in the pre-task localizer and are not included in sequenceness analyses. (B) Peak classifier performance from 140 to 210 ms after stimulus onset in the localizer phase (depicting the diagonal extracted from A). (C) Forward sequenceness for all learned paths during the planning and feedback periods was evident at a common state-to-state lag of 70 ms in both trial periods. Open dots indicate time points exceeding a permutation significance threshold, which differs for the two periods. (D) Backward sequenceness for all learned paths during the planning and feedback periods was evident at state-to-state lags spanning 10 to 50 ms in the feedback period alone. Note that the x-axis in the sequenceness panels indicates the lag between reactivations, derived as a summary measure across seconds; the axis does not represent time within a trial period. Open dots indicate time points exceeding a permutation significance threshold, which differs for the two periods. Shaded error margins represent SEM. See SI Appendix, Fig. S6 for example sequenceness events and SI Appendix, Fig. S5 for extended time lags.
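The sequenceness measure summarized in these panels can be illustrated with a minimal sketch, assuming a TDLM-style (temporally delayed linear modeling) analysis of the kind commonly used for MEG replay detection. The function name, array shapes, and the simplified two-level regression below are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch of sequenceness estimation at a single state-to-state lag,
# assuming a TDLM-style (temporally delayed linear modeling) approach.
# Names and array shapes are illustrative, not the paper's code.
import numpy as np

def sequenceness_at_lag(state_probs, transitions, lag):
    """Net forward-minus-backward replay evidence at one time lag.

    state_probs: (n_timepoints, n_states) decoded reactivation probabilities
                 from the pre-task localizer classifiers
    transitions: (n_states, n_states) binary matrix of learned path transitions
    lag:         lag in samples between successive state reactivations (> 0)
    """
    X = state_probs[:-lag]            # reactivation evidence at time t
    Y = state_probs[lag:]             # reactivation evidence at time t + lag
    # First level: regress each state's lagged time course on all states,
    # giving an empirical state-to-state coupling matrix at this lag.
    betas = np.linalg.pinv(X) @ Y     # (n_states, n_states)
    # Second level: project the coupling matrix onto the task's forward and
    # backward transition structure.
    forward = np.sum(betas * transitions)
    backward = np.sum(betas * transitions.T)
    return forward - backward         # > 0: net forward sequenceness
```

Evaluating such a function across lags (e.g., 10 to 130 ms in sample steps) and comparing the resulting curve against a permutation distribution built from shuffled transition matrices corresponds conceptually to the lag plots in panels C and D.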
Fig. 3.
Planning period replay increases with the benefit of model-based generalization. (A) Stronger forward replay on trials where the start state differs from the previous trial, i.e., when there is a greater benefit to utilizing model-based knowledge. (B) Time course of regression coefficients for the variables of interest, showing effects at state-to-state lags from 10 to 130 ms. The light blue line highlights the 70-ms time lag of interest shown in A. Y-axes represent sequenceness regression coefficients for the binary different versus same start state predictor. See SI Appendix, Fig. S5 for extended time lags. Seq, sequenceness. **P < 0.01; ***P < 0.001; +P < 0.01, corrected for multiple comparisons.
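As a rough illustration of how per-trial sequenceness values could be related to task variables such as the different-versus-same start state contrast in panel B (and, analogously, the rarity regressor in Fig. 4), the sketch below runs a simple per-participant regression. The variable names and the use of ordinary least squares are assumptions for illustration and not necessarily the paper's exact model.

```python
# Hypothetical sketch: relate per-trial sequenceness (at the lag of interest)
# to a binary trial variable, e.g. different vs. same start state.
# Variable names are illustrative; the paper's exact model may differ.
import numpy as np
import statsmodels.api as sm

def sequenceness_regression(seq_per_trial, different_start):
    """seq_per_trial:    (n_trials,) sequenceness at the lag of interest
    different_start:     (n_trials,) 1 if the start state differs from the
                          previous trial in the same world, else 0
    Returns the coefficient and p-value for the start-state regressor."""
    X = sm.add_constant(different_start.astype(float))
    fit = sm.OLS(seq_per_trial, X).fit()
    return fit.params[1], fit.pvalues[1]
```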
Fig. 4.
Feedback period backward replay increases with the rarity of recent other world experience. (A) Rarity (lower recent experience) of the other world correlated with greater backward replay of other world paths. (B) Time course of regression coefficients for the rarity effect of interest, showing effects at state-to-state lags from 10 to 130 ms. The light blue line highlights the 40-ms time lag of interest shown in A. Y-axes represent sequenceness regression coefficients for the rarity of the other world. See SI Appendix, Fig. S5 for extended time lags. Seq, sequenceness; *P < 0.05. (C) Across-participant relationship between the replay-rarity effect and lower planning period forward replay (world change trials; P = 0.009).
Fig. 5.
Exploratory replay onset beamforming analyses. (A) In the planning period, beamforming analyses revealed power increases associated with replay onset in the right MTL, including the hippocampus. (B) After reward feedback, power increases associated with replay onset were found in the bilateral MTL, including the hippocampus. See also SI Appendix, Fig. S7 and Table S4. The y coordinate refers to the MNI atlas. For display, statistical maps were thresholded at P < 0.01 uncorrected; clusters significant at P < 0.05, whole-brain corrected using nonparametric permutation tests. For unthresholded statistical maps and results within the hippocampus ROI mask, see https://neurovault.org/collections/11163/.
