[Preprint]. 2024 Feb 26:2024.02.21.581495.
doi: 10.1101/2024.02.21.581495.

Monkeys engage in visual simulation to solve complex problems


Aarit Ahuja et al. bioRxiv.

Abstract

Visual simulation - i.e., using internal reconstructions of the world to experience potential future versions of events that are not currently happening - is among the most sophisticated capacities of the human mind. But is this ability in fact uniquely human? To answer this question, we tested monkeys on a series of experiments involving the 'Planko' game, which we have previously used to evoke visual simulation in human participants. We found that monkeys were able to successfully play the game using a simulation strategy, predicting the trajectory of a ball through a field of planks while demonstrating a level of accuracy and behavioral signatures comparable to humans. Computational analyses further revealed that the monkeys' strategy while playing Planko aligned with a recurrent neural network (RNN) that approached the task using a spontaneously learned simulation strategy. Finally, we carried out awake functional magnetic resonance imaging while monkeys played Planko. We found activity in motion-sensitive regions of the monkey brain during hypothesized simulation periods, even without any perceived visual motion cues. This neural result closely mirrors previous findings from human research, suggesting a shared mechanism of visual simulation across species. In all, these findings challenge traditional views of animal cognition, proposing that nonhuman primates possess a complex cognitive landscape, capable of invoking imaginative and predictive mental experiences to solve complex everyday problems.

Conflict of interest statement

Competing Interests: The authors declare no competing interests.

Figures

Figure 1:
A - Examples of Planko boards used in the task. Monkeys were required to predict which catcher the ball would land in, if dropped. In the present example, the three boards in the left column lead to the left catcher, and the three boards in the right column lead to the right catcher. B - A schematic of one complete trial, including the pre-response period when the monkeys could potentially simulate the ball’s trajectory and the post-response period when they saw the ball fall. C - A diagram of the NHP upright rig setup that was used for training and behavioral testing on the task. Monkeys indicated their responses using one of the two provided buttons, and were given juice reward for correct responses.
Figure 2:
A - Examples of the “shadow ball” that was used to train the monkeys on Planko. Shadow ball trials were always intermixed with non-shadow ball trials. Over the course of training, the shadow ball gradually faded away until it revealed none of the ball’s trajectory, in the hopes that monkeys would continue to extrapolate the ball’s trajectory even when it was not shown. B - Examples of various onscreen plank counts that were used during training. Both monkeys began with one-plank boards and progressed gradually until they were able to navigate ten onscreen planks. C - The progression of each monkey’s task accuracy across multiple training sessions in which onscreen planks were progressively increased (non-shadow ball trials only). Both monkeys initially struggled with increasing plank numbers, before arriving at a generalizable strategy that allowed them to maintain consistent task accuracy. D - An example of a board where slightly jittering the position of each plank (three jittered examples j2,3,4 shown with the original j1 underlaid) had a minimal impact on the ball’s final position. Outcome changes were assigned a penalty (in this example, only penalties of 0 were assigned), and used to calculate a simulation uncertainty score. Boards like this one were classified as having a low simulation uncertainty. E - An example of a board where slightly jittering the position of each plank (three jittered examples j2,3,4 shown with the original j1 underlaid) had a significant impact on the ball’s final position. Outcome changes were assigned a penalty between 0 and 1 and used to calculate a simulation uncertainty score. Such boards were classified as having a high simulation uncertainty. F - A histogram of all the simulation uncertainty scores assigned to boards from the two monkeys’ task test days. G - Task accuracy for Monkey G and Monkey A as a function of simulation uncertainty. Both monkeys were affected by this metric, suggesting that they might be using a simulation strategy. H - A schematic depicting the analysis of eye movement overlap between pre-response and post-response trial periods. I - Eye movement spatial overlap for Monkey G and Monkey A, relative to a shuffled chance level. Both monkeys showed a higher-than-chance degree of overlap between pre- and post-response eye movements, consistent with a simulation strategy. J - Data from G and I compared to past findings from human subjects. Both monkeys showed behavioral and oculomotor trends that are in line with what we have previously observed in humans (see text for details).
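The jitter-based score described in panels D and E can be illustrated with a minimal sketch. This is not the authors' implementation. The `toy_outcome` function, the tilt-based board encoding, the binary 0/1 penalty, and the `n_jitters` and `jitter_sd` parameters are all assumptions made for illustration. The caption states only that plank positions were jittered and that outcome changes received penalties between 0 and 1.

```python
import random


def toy_outcome(planks):
    """Hypothetical stand-in for a physics simulation of a Planko board.

    `planks` is a list of signed tilt values (an invented encoding); the ball
    is said to land 'left' if the tilts sum to a negative number, else 'right'.
    """
    return "left" if sum(planks) < 0 else "right"


def simulation_uncertainty(planks, n_jitters=100, jitter_sd=0.05, seed=0):
    """Mean penalty over jittered copies of a board (0 = stable, 1 = always flips)."""
    rng = random.Random(seed)  # fixed seed so the score is reproducible
    baseline = toy_outcome(planks)
    penalties = []
    for _ in range(n_jitters):
        jittered = [p + rng.gauss(0.0, jitter_sd) for p in planks]
        # Assumption: a flipped catcher costs 1 and an unchanged one costs 0.
        penalties.append(0.0 if toy_outcome(jittered) == baseline else 1.0)
    return sum(penalties) / len(penalties)


stable_board = [1.0, 1.0, 1.0]        # far from the decision boundary
fragile_board = [0.01, -0.005, 0.0]   # tiny jitters can flip the outcome
low = simulation_uncertainty(stable_board)
high = simulation_uncertainty(fragile_board)
print(low, high)
```

In this toy setting, a board whose outcome survives every jitter scores 0, and a board near the decision boundary scores higher. This mirrors the low and high simulation uncertainty classes in the caption.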
Figure 3:
A - Examples of two types of networks, a feedforward convolutional neural network (CNN) and a feedback recurrent neural network (RNN) that were trained to solve the Planko task. B - Each network’s task accuracy when tested on the same board sets from the monkeys’ task test days. Like the monkeys (MG and MA), both networks achieved above chance accuracy. C - A heat map showing the average activity of the hidden units on an example board for both the CNN and the RNN. The second row shows the same activity again, but with the input board image overlaid. D - A schematic depicting how we quantified whether the ball’s trajectory was represented in the network hidden layer activity. E - Average RMSE values for each predicted vs actual position for the CNN and RNN trained decoders, relative to the board image trained (null/chance) model. While the CNN trained decoders almost never achieved greater than chance prediction accuracy, the RNN trained decoders consistently predicted the position of the ball with a high degree of accuracy. F - Network uncertainty for the CNN and the RNN as a function of whether each monkey gave the correct or incorrect response on a given board. The CNN’s average network uncertainty was no different for boards that the monkeys got correct vs boards that they got incorrect, whereas the RNN’s average network uncertainty was significantly higher on boards that the monkeys got incorrect compared to boards they got correct.
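The comparison in panel E amounts to computing a root-mean-square error (RMSE) between decoded and actual ball positions and comparing it with the RMSE of a null model. The sketch below shows that calculation under stated assumptions: positions are taken to be (x, y) pairs, and every number is an arbitrary placeholder, not data from the study.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square Euclidean error between two lists of (x, y) positions."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder values for illustration only (not results from the paper).
actual_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
decoder_path = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1)]  # hypothetical decoder output
null_path = [(1.0, 1.0)] * 3                         # hypothetical null model output

decoder_rmse = rmse(decoder_path, actual_path)
null_rmse = rmse(null_path, actual_path)
print(decoder_rmse < null_rmse)
```

A decoder whose RMSE falls below the null model's recovers position information beyond what the null model provides. This is the sense in which the caption contrasts the RNN-trained and CNN-trained decoders.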
Figure 4:
A - A diagram of the NHP fMRI setup. Monkeys were seated in the “sphinx” position and placed inside the scanner, where they viewed a screen at the end of the bore and indicated their response using MRI compatible button boxes. B - A schematic of the motion localizer task we used to isolate motion sensitive ROIs. C - A schematic of the three variants of the main Planko task that monkeys were trained to perform inside the scanner. D - An example of one complete scanning session, containing motion localizer blocks at the beginning and end, and several blocks of each task variant randomly interspersed throughout (grey regions indicate interblock intervals).
Figure 5:
A - The result of the Motion > Flicker contrast from the motion localizer task. We observed activity in canonically motion-sensitive brain areas, such as MT, MST, V4d, and 45b. B - The result of the Perception > Control contrast from the Planko task variants. Once again, we observed activity in many of the same motion-sensitive areas, such as MT, MST, and V4d. C - The result of the Simulation > Control contrast from the Planko task variants. Here too, we observed striking activity in motion-sensitive areas such as MT, MST, and V4d. D - A depiction of the motion-sensitive ROIs that were used for subsequent representational similarity analyses. All voxels that survived cluster correction at a p < 0.05 FWE threshold were selected. E - A schematic showing our main comparisons of interest. We compared the pattern of activity (relative to baseline) in the Simulation condition to both the Perception condition (S-P) and the Control condition (S-C). G (Left) - A comparison of S-P and S-C representational similarities. As with the human participants in our previous study (Ahuja et al., 2022), Monkey G’s data also showed an elevated voxel-wise pattern similarity for the S-P comparison relative to the S-C comparison. (Right) S-P minus S-C similarity for human participants and Monkey G.
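The S-P versus S-C comparison in panels E and G can be sketched as a difference of voxel-wise pattern similarities. The caption does not name the similarity metric, so the use of Pearson correlation here is an assumption. The five-voxel patterns are arbitrary placeholders chosen for illustration and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("patterns must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("patterns must not be constant")
    return cov / (norm_x * norm_y)


# Placeholder voxel patterns (activity relative to baseline); illustration only.
simulation = [0.2, 0.8, 0.5, 0.9, 0.1]
perception = [0.3, 0.7, 0.6, 1.0, 0.2]
control = [0.9, 0.1, 0.4, 0.2, 0.8]

s_p = pearson(simulation, perception)  # Simulation vs Perception similarity
s_c = pearson(simulation, control)     # Simulation vs Control similarity
difference = s_p - s_c                 # the "S-P minus S-C" quantity
print(difference > 0)
```

A positive difference indicates that the Simulation pattern resembles the Perception pattern more than the Control pattern. This is the direction of the effect the caption reports for Monkey G and the human participants.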

References

    1. Springer A., Parkinson J. & Prinz W. Action simulation: time course and representational mechanisms. Frontiers in Psychology 4, 387 (2013).
    2. Battaglia P. W., Hamrick J. B. & Tenenbaum J. B. Simulation as an engine of physical scene understanding. Proceedings of the National Academy of Sciences 110, 18327–18332 (2013).
    3. Ullman T. D., Spelke E., Battaglia P. & Tenenbaum J. B. Mind Games: Game Engines as an Architecture for Intuitive Physics. Trends in Cognitive Sciences 21, 649–665 (2017).
    4. Ahuja A., Desrochers T. M. & Sheinberg D. L. A role for visual areas in physics simulations. Cognitive Neuropsych 1–15 (2022). doi:10.1080/02643294.2022.2034609.
    5. Ahuja A. & Sheinberg D. L. Behavioral and oculomotor evidence for visual simulation of object movement. Journal of Vision 19, 13–13 (2019).
