eLife. 2020 Apr 20;9:e54519. doi: 10.7554/eLife.54519.

Behavioral evidence for memory replay of video episodes in the macaque


Shuzhen Zuo et al. eLife, 2020.

Abstract

Humans recall the past by replaying fragments of events in temporal order. Here, we demonstrate a similar effect in macaques. We trained six rhesus monkeys on a temporal-order judgement (TOJ) task and collected 5000 TOJ trials. In each trial, the monkeys watched a naturalistic video of about 10 s comprising two across-context clips and, after a 2 s delay, performed a TOJ between two frames from the video. The data are suggestive of a non-linear, time-compressed forward memory replay mechanism in the macaque. In contrast with humans, however, such compression of replay is not sophisticated enough to allow these monkeys to skip over irrelevant information by compressing the encoded video globally. We also reveal that the monkeys detect event contextual boundaries and that such detection facilitates recall by increasing the rate of information accumulation. Demonstration of a time-compressed, forward replay-like pattern in the macaque provides insights into the evolution of episodic memory in our lineage.

Keywords: drift diffusion model framework; event boundary detection; forward replay-like pattern; human; naturalistic material; neuroscience; rhesus macaque; temporal order judgement; time compression of memory traces.


Conflict of interest statement

SZ, LW, JS, YC, BZ, SL, KA, YZ, SK: No competing interests declared.

Figures

Figure 1. TOJ task schema and RT results.
(A) In each trial, the monkey watched a video (8–12 s, comprising two 4–6 s video clips) and, following a 2 s retention delay, made a temporal order judgement between two probe frames extracted from the video. The monkeys were required to choose the frame that had appeared earlier in the video to obtain a water reward. (B) Task performance of the six monkeys. Proportion correct for the six monkeys (left); mean reaction times for the three trial types (right). Error bars are standard errors of the means over monkeys. *** denotes p<0.001. (C) Linear plots of reaction time (RT) for each monkey as a function of chosen frame location; see also Table 1. (D) Linear plots of RT as a function of chosen frame location for each human participant; see also Supplementary file 4. In panels (C) and (D), black lines and orange lines refer to lists of non-primate and primate video clips, respectively (with five repetitions collapsed for the monkeys and two repetitions collapsed for the human participants). All responses in the within-context condition are shown, with cyan and magenta dots denoting whether the chosen probe frames were extracted from Clip 1 or Clip 2, respectively.
Figure 1—figure supplement 1. Performance of human participants and speed-accuracy trade-off results.
(A) (Left) Task performance of seven human participants. The proportions of correct responses in the across-context condition are significantly higher than those in the within-context condition (all p<0.001). (Right) Mean reaction times for the three trial types differ from each other (across- vs. within-Clip1 vs. within-Clip2: F(2, 18) = 18.65, p<0.001), and RT is significantly faster in the across-context condition than in the within-context conditions (Clip 1 and Clip 2). Error bars are standard errors of the means over participants. *** denotes p<0.001. (B) Speed-accuracy trade-off analysis. The monkeys showed a mild numerical increase in inverse efficiency score across the four segments, but this did not reach statistical significance (left); the humans showed a lower inverse efficiency score for the video parts immediately after a boundary (F(3, 24) = 4.17, p=0.016), with post hoc tests showing a significant difference between bars 2 and 3 (right).
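The inverse efficiency score used in panel (B) is conventionally computed as mean RT divided by proportion correct, so lower values indicate more efficient responding. A minimal sketch, with hypothetical RTs and accuracies (not the per-segment values from the paper):

```python
# Inverse efficiency score (IES): mean RT over proportion correct.
# Lower IES = more efficient responding. All values are illustrative.

def inverse_efficiency(rts_ms, correct):
    """IES for one condition: mean RT (ms) divided by proportion correct.
    `correct` is a list of 1 (correct) / 0 (incorrect) flags."""
    mean_rt = sum(rts_ms) / len(rts_ms)
    prop_correct = sum(correct) / len(correct)
    return mean_rt / prop_correct

# Hypothetical segments: faster but less accurate vs. slower but accurate.
seg_after_boundary = inverse_efficiency([520, 540, 510, 530], [1, 1, 1, 0])
seg_mid_clip = inverse_efficiency([600, 620, 610, 590], [1, 1, 1, 1])
print(seg_after_boundary, seg_mid_clip)
```

The point of the score is visible in the toy numbers: the first segment has faster raw RTs but its accuracy penalty makes it *less* efficient overall.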
Figure 2. Moving average analysis based on reciprocal latency and accuracy for monkeys (left panel) and human participants (right panel).
(A) Reciprocal latency for monkeys as a function of chosen frame location, averaged over all animals (upper panel), in the within-context condition, with the results for the six individual monkeys shown in the lower panel. The relationship between chosen frame location and RT follows a non-linear pattern. (B) Reciprocal latency for human participants as a function of chosen frame location, averaged over all human subjects (upper panel), in the within-context condition, with the results for individual subjects shown in the lower panel. In panels (A) and (B), the shaded regions denote confidence intervals. (C) Proportion of correct answers for individual monkeys as a function of target frame location in the within-context condition. (D) Proportion of correct answers for individual human subjects as a function of target frame location in the within-context condition. In panels (C) and (D), the horizontal blue lines denote chance-level accuracy. Blue vertical lines denote the mean boundary location between Clip 1 and Clip 2 (116th frame).
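The moving-average curves can be sketched as a sliding-window smooth of reciprocal latency (1/RT) over chosen frame location. The window size and RT values below are hypothetical, not the paper's:

```python
# Sliding-window smooth of reciprocal latency (1/RT) across frame-location
# bins. Window size and RT values are illustrative only.

def moving_average(values, window=3):
    """Simple unweighted moving average; output is shorter than the input
    by window - 1 samples."""
    out = []
    for i in range(len(values) - window + 1):
        out.append(sum(values[i:i + window]) / window)
    return out

rts_s = [0.9, 0.8, 0.85, 0.7, 0.75, 0.6]   # mean RT (s) per frame-location bin
recip = [1.0 / rt for rt in rts_s]          # reciprocal latency (1/s)
smoothed = moving_average(recip, window=3)
print(smoothed)
```

Reciprocal latency is the natural quantity to smooth here because, under the LATER framework used later in the paper, 1/RT rather than RT is approximately normally distributed.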
Figure 2—figure supplement 1. Relationship between temporal similarity and reciprocal latency for within-context trials in (A) monkeys and (B) humans.
Reciprocal latency as a function of temporal similarity for the average of all individuals (upper panel) and for each individual (bottom panel). The temporal similarity between two frames is mathematically the inverse of their respective frame locations in the video.
Figure 3. Model comparison using representational similarity analysis.
(A) Visualization of the two candidate models as representational dissimilarity matrices (RDMs). Patterns of reaction time (rank-transformed values) as a function of chosen frame location for the two hypothetical models. The colors of Clip 1 and Clip 2 evolve with the temporal progression of the video (left), alongside their respective hypothetical RDMs (right). The reduction in RT (indicated by an arrow) between Clip 1 and Clip 2 is defined as the ‘offset’; the magnitude of this ‘offset’ is arbitrary (but see further analysis in Figure 4). (B) We segmented the videos into eight equal segments, and the RDMs show pairwise Euclidean distances between these segments for the species group average (monkeys: left; humans: right) and for each individual separately (monkeys: bottom left; humans: bottom right). RDM correlation tests between the behavioral RDMs and the two candidate RDMs show that the monkeys replay the footage using a Strict forward strategy (r = 0.66, p=0.009), with little evidence for the Global compression strategy (r = −0.16, p=0.802). Humans show the opposite pattern: the Global compression model correlates more strongly with the behavioral RDM (marginally non-significant, r = 0.39, p=0.069) than the Strict forward model does (r = −0.11, p=0.703). Pairwise comparisons between the two models are statistically significant for both monkeys (p<0.01) and humans (p<0.05). Error bars indicate the standard errors of the means based on 100 iterations of randomization. P values are FDR-corrected (***, p<0.001; **, p<0.01; *, p<0.05).
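The comparison in panel (B) can be sketched as: compute a per-segment RT profile, form an RDM of pairwise distances between segments, then Spearman-correlate its upper triangle with a candidate model's RDM. The segment values are fabricated, and the ranking helper ignores tie correction, so this is a simplified illustration rather than the paper's pipeline:

```python
# RSA sketch: behavioral RDM (pairwise distances between per-segment mean
# RTs) correlated with a model RDM via a tie-naive Spearman correlation.
# All segment values below are hypothetical.

def rdm_upper(values):
    """Upper triangle of the pairwise absolute-difference RDM."""
    n = len(values)
    return [abs(values[i] - values[j]) for i in range(n) for j in range(i + 1, n)]

def _ranks(xs):
    # Simplified ranking: ties get consecutive ranks in index order.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    """Spearman rho = Pearson correlation of the rank-transformed data."""
    rx, ry = _ranks(xs), _ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

behav_rt = [1.0, 1.2, 1.4, 1.6, 1.3, 1.5, 1.7, 1.9]  # 8 video segments (s)
strict_forward = [1, 2, 3, 4, 3, 4, 5, 6]            # RT pattern with Clip 2 offset
r = spearman(rdm_upper(behav_rt), rdm_upper(strict_forward))
print(round(r, 3))
```

A production analysis would use a tie-corrected Spearman (e.g. `scipy.stats.spearmanr`) and permutation-based p values, as the figure's randomization iterations suggest.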
Figure 4. The Strict forward model provides a better fit to the RT data in monkeys but not in humans.
(A) ‘Offsets’ are defined as the magnitude of the RT reduction when the frames were in Clip 2. Eleven hypothetical models are shown with their reaction time patterns (top) and RDMs (bottom). We systematically varied the ‘offset’ parameter while keeping a constant slope. These models progressively range from an absolute Global compression model (model 1, leftmost) to a Strict forward model (model 6, middle), and beyond (models 7–11, right). The numerals below the RDMs denote the magnitudes of the respective offsets. (B) Each monkey’s data were tested against each of these 11 hypothetical models. The Spearman correlations increase as a function of the offset magnitude between Clip 1 and Clip 2 until reaching an asymptote when the offset value is around zero, which corresponds to the Strict forward model (model 6 in panel (A); see also Figure 3). Individuals’ RT RDMs are shown in insets. (C) Each human participant’s data were also tested against each of these 11 hypothetical models. The Spearman correlations decrease as a function of the offset magnitude between Clip 1 and Clip 2 until reaching an asymptote when the offset value is around zero. This analysis confirms the hypothesized discrepancy between the two species (see also Figure 3B).
Figure 5. LATER model fitting of RT in across-context and within-context conditions for both species.
(A) Cartoon of the LATER model, illustrating that a decision signal triggered by a stimulus rises from its start level, at a rate of information accumulation r, to the threshold. Once it reaches the threshold, a decision is initiated. The rate of rise r varies from trial to trial, following a Gaussian distribution (variation denoted by the green shaded area). (B) Contextual change effect on the distribution of response latency for the monkeys; data from Monkey ‘Mars’ are shown in the larger display. (C) Contextual change effect on the distribution of response latency for humans; data from Subject 1 are shown in the larger display. The red and blue dashed lines show the best fits (maximum likelihood) for across-context trials and within-context trials, respectively (see also Supplementary file 3).
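The LATER mechanism in panel (A) can be sketched directly: draw the rise rate r from a Gaussian on each trial, and the latency is the threshold distance divided by r, which makes 1/RT itself Gaussian ("recinormal"). The parameter values below are illustrative, not fitted values from the paper:

```python
# LATER-model sketch: decision signal rises from start level toward the
# threshold at rate r ~ N(mu_r, sd_r); RT = distance / r, so 1/RT is
# Gaussian. Parameters are illustrative, not fitted values.
import random

def simulate_later(n_trials, mu_r=5.0, sd_r=1.0, distance=1.0, seed=0):
    """Simulate RTs (s); trials whose sampled rate is non-positive never
    reach threshold and are discarded."""
    rng = random.Random(seed)
    rts = []
    for _ in range(n_trials):
        r = rng.gauss(mu_r, sd_r)
        if r > 0:
            rts.append(distance / r)
    return rts

rts = simulate_later(10_000)
mean_recip = sum(1.0 / rt for rt in rts) / len(rts)
print(round(mean_recip, 2))   # close to mu_r / distance
```

The contextual-boundary effect reported in the paper corresponds, in these terms, to a higher mean rate of rise mu_r on across-context trials, shifting the whole latency distribution leftward.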
Figure 6. Full GLM analysis including a number of variables that might affect reciprocal latency, separately for within-context and across-context conditions.
(A) Monkey data. (B) Human data. We included ten regressors: a binary regressor indicating whether the video category is primate or non-primate (video category), a binary regressor indicating whether a video was played forward or backward (play order), the repeated exposure of the trial (monkeys: 1–5; humans: 1–2) (exposure), the physical location of the selected probe on screen (left or right) (touch side), time elapsed within a session (elapsed time; to rule out fatigue or attentional confounds), chosen frame location, temporal similarity, SURF similarity as a perceptual similarity measure (perceptual similarity), temporal distance between the two probe frames, and the subject’s response (correct/incorrect). In the monkeys, the results confirm that chosen frame location is the most significant regressor in within-context trials, whereas perceptual similarity is the most significant regressor in across-context trials. ***, p<0.001; **, p<0.01; *, p<0.05.
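A GLM of this kind reduces to ordinary least squares on a trial-by-regressor design matrix. A minimal sketch with only three of the ten columns (intercept, chosen frame location, perceptual similarity) and fabricated trial data:

```python
# OLS sketch for the reciprocal-latency GLM: solve the normal equations
# (X'X) beta = X'y by Gaussian elimination. Design-matrix rows are
# fabricated; only 3 of the paper's 10 regressors are included.

def ols(X, y):
    """Return OLS coefficients for design matrix X (list of rows) and y."""
    n, k = len(X), len(X[0])
    A = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)]
         for i in range(k)]                       # X'X
    b = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]  # X'y
    for col in range(k):                          # elimination with pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):                # back substitution
        beta[r] = (b[r] - sum(A[r][c] * beta[c]
                              for c in range(r + 1, k))) / A[r][r]
    return beta

# columns: intercept, chosen frame location, perceptual (SURF) similarity
X = [[1, 10, 0.2], [1, 50, 0.5], [1, 120, 0.8], [1, 200, 0.3], [1, 150, 0.6]]
y = [1.4, 1.2, 0.9, 0.6, 0.8]   # reciprocal latency (1/s), fabricated
beta = ols(X, y)
print([round(v, 3) for v in beta])
```

In practice one would use a statistics package (e.g. `statsmodels`) to obtain the per-regressor p values shown in the figure; the sketch only shows where the coefficients come from.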
Figure 6—figure supplement 1. GLM results on the effects of image similarity measures on reciprocal latency for within-context and across-context conditions for (A) the monkeys and (B) humans, and (C) an example of SURF similarity.
Among the several indices tested (difference of RGB-histogram distributions, Histogram of Oriented Gradients (HOG) similarity, and SURF similarity), the SURF similarity measure was significantly correlated with reciprocal latency in the across-context condition in both species (monkeys: p=0.0015; humans: p<0.001). Higher perceptual dissimilarity leads to shorter RT. In panel (C), SURF uses various scales and orientations to identify unique features, or key-points, in an image. The cartoon illustrates how features from two images can be matched irrespective of their scale. Thus, if the same feature exists in another image that is smaller or larger in size, or even at a different orientation, SURF can still identify that feature (or key-point) as corresponding in both images. Features of the inset are matched to the T-shirt on the basis of how strongly they are related, although a minority of feature points may still incorrectly match other parts of the image.
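The matching step in panel (C) can be sketched as nearest-neighbour descriptor matching with a ratio test, which is also why a minority of key-points mismatch: a wrong neighbour occasionally passes the test. Real SURF descriptors are 64-dimensional and come from an image pipeline (e.g. OpenCV); the toy 2-D vectors below stand in for them:

```python
# Descriptor-matching sketch in the spirit of SURF key-point matching:
# nearest neighbour with a Lowe-style ratio test. Descriptors are toy
# 2-D vectors, not real SURF output.

def dist(a, b):
    """Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def match(desc_a, desc_b, ratio=0.75):
    """For each descriptor in image A, accept its closest descriptor in
    image B only if it is clearly closer than the second-best candidate."""
    matches = []
    for i, da in enumerate(desc_a):
        ds = sorted((dist(da, db), j) for j, db in enumerate(desc_b))
        if len(ds) > 1 and ds[0][0] < ratio * ds[1][0]:
            matches.append((i, ds[0][1]))
    return matches

img_a = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
img_b = [[0.12, 0.88], [0.79, 0.21], [0.4, 0.45]]   # perturbed copies of img_a
print(match(img_a, img_b))
```

Because matching operates on local descriptors rather than pixel positions, the same feature is recovered even when the second image is rescaled or rotated, which is the scale/orientation invariance the caption describes.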
Figure 7. Sessional accuracy data expressed as proportion correct for each individual.
(A) Monkey data. (B) Human data. No obvious increase in performance was observed over the course of testing days in the experiment for either monkeys or humans.
