Neuron. 2016 May 18;90(4):893-903.
doi: 10.1016/j.neuron.2016.03.037.

Neural Mechanisms of Hierarchical Planning in a Virtual Subway Network

Jan Balaguer et al. Neuron. 2016.

Abstract

Planning allows actions to be structured in pursuit of a future goal. However, in natural environments, planning over multiple possible future states incurs prohibitive computational costs. To represent plans efficiently, states can be clustered hierarchically into "contexts". For example, representing a journey through a subway network as a succession of individual states (stations) is more costly than encoding a sequence of contexts (lines) and context switches (line changes). Here, using functional brain imaging, we asked humans to perform a planning task in a virtual subway network. Behavioral analyses revealed that humans executed a hierarchically organized plan. Brain activity in the dorsomedial prefrontal cortex and premotor cortex scaled with the cost of hierarchical plan representation, and unique neural signals in these regions signaled contexts and context switches. These results suggest that humans represent hierarchical plans using a network of caudal prefrontal structures.

Figures

Figure 1
Task and Design (A) Schematic representation of planning under a flat (left) and hierarchical (right) policy. Each node from left (start state, shown by the robot) to right shows a possible state (i.e., station) that could be visited. The flag indicates the destination station. A hierarchical policy allows the agent to “chunk” the maze into contexts (here, a red line and a blue line). This in turn reduces the cost of planning and plan representation. (B) The subway map that participants navigated. The map was rotated and the line colors and station names were shuffled between participants. Participants only saw the map during training. (C) A schematic depiction of the sequence of events (trials) that occurred on an example journey. The names at the top and bottom of the screen refer to the current and destination stations, respectively. The responses (arrows) and lines (colored dots) were not shown to participants. Timings (in seconds) for the various events are shown below. (D) Examples of how the various distances were calculated for an example map: DS (stations to goal), DL (lines to goal), DX (exchange stations to goal), and DU (U-turn cost). The numbers and blue-red colormap show the distance in each metric that was used to estimate the cost of planning. The robot shows the start point, and the flag shows the destination station.
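The distance measures in panel (D) can be illustrated on a toy map. The sketch below is a minimal illustration, not the authors' procedure: the two-line network, the function names, and the exact definitions are assumptions. DS is taken to be the fewest station-to-station hops to the goal, and DL the fewest lines that must be ridden to reach it. DX and DU are left out because the caption does not define them precisely enough to reconstruct.

```python
from collections import deque

# Hypothetical toy network (not the map used in the study):
# two lines that cross at the exchange station "C".
LINES = {
    "red": ["A", "B", "C", "D"],
    "blue": ["E", "C", "F", "G"],
}


def station_graph(lines):
    """Link each station to its neighbours along every line."""
    graph = {}
    for stops in lines.values():
        for a, b in zip(stops, stops[1:]):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def bfs_distance(graph, start, goals):
    """Fewest edges from start to any node in goals (None if unreachable)."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in goals:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def ds(lines, start, goal):
    """Assumed DS: number of station hops on the shortest route."""
    return bfs_distance(station_graph(lines), start, {goal})


def dl(lines, start, goal):
    """Assumed DL: fewest lines ridden (1 = no line change needed)."""
    # Two lines are neighbours when they share at least one station.
    line_graph = {
        name: {
            other
            for other in lines
            if other != name and set(lines[name]) & set(lines[other])
        }
        for name in lines
    }
    start_lines = [name for name, stops in lines.items() if start in stops]
    goal_lines = {name for name, stops in lines.items() if goal in stops}
    changes = [bfs_distance(line_graph, name, goal_lines) for name in start_lines]
    changes = [c for c in changes if c is not None]
    return min(changes) + 1 if changes else None


if __name__ == "__main__":
    print(ds(LINES, "A", "G"), dl(LINES, "A", "G"))
```

On this toy map a journey from A to G spans four station hops but only two lines, which is the sense in which a line-level description of the route is shorter than a station-level one.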
Figure 2
Behavioral and Neural Costs of Flat and Hierarchical Planning (A) Regression coefficients (mean ± SEM across participants) showing the slope of the predictive relationship between experimental variables (including distance estimates) and log RTs. (B) Parametric responses (mean ± SEM) to DS and DL in the PMC and rlPFC. There is a significant condition × region interaction. The rlPFC ROI is shown on the right. (C) Encoding of the four plan complexity measures (GLM1) in the lateral (coronal view; upper) and medial (sagittal view; lower) frontal cortices, rendered onto a template brain, thresholded at p < 0.001 uncorrected. (D) Correlation with proximity to goal (GLM1) in the vmPFC. (E) Correlation with proximity to goal (GLM2) in the hippocampus. Activations exceeding p < 0.001, uncorrected, are shown. (F) Correlation between parameter estimates linking log RTs to plan complexity in units of stations (left) and lines (right), with beta values encoding the corresponding distance measure in the PMC (upper) and dmPFC (lower). The dots correspond to individual subjects. The lines are best linear fits for significant (red) and non-significant (gray) correlations. Significant regions within a circle survived correction for multiple comparisons.
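The analysis behind panel (A) can be sketched as a per-participant regression of log RT on distance predictors, followed by a mean ± SEM summary of each slope across participants. This is a minimal sketch under stated assumptions: the predictor set (an intercept plus DS and DL only), the function names, and the synthetic trials are all hypothetical, and the study's full model is not specified in this record.

```python
import math
import statistics


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        if abs(p) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def fit_log_rt(trials):
    """trials: (ds, dl, rt_seconds) tuples -> [intercept, slope_ds, slope_dl]."""
    X = [[1.0, float(d_s), float(d_l)] for d_s, d_l, _ in trials]
    y = [math.log(rt) for _, _, rt in trials]
    return ols(X, y)


def mean_sem(values):
    """Mean and standard error of the mean across participants."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Synthetic, noise-free trials for one hypothetical participant.
    design = [(1, 1), (2, 1), (3, 2), (4, 2), (2, 2), (5, 3)]
    trials = [(s, l, math.exp(0.2 + 0.05 * s + 0.1 * l)) for s, l in design]
    print(fit_log_rt(trials))
```

Fitting each participant separately and then summarising the slopes treats participants as the unit of inference, which matches the "mean ± SEM across participants" wording of the caption.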
Figure 3
BOLD Responses to Bottleneck States (A) BOLD signal β values (mean ± SEM) from a single-trial GLM approach in the PMC on three regular stations preceding (leftmost points) and following (rightmost points) a context switch (green lines), an exchange station without line change (purple lines), or an elbow station (cyan lines). The activation at the context switch, exchange station, or elbow is shown with a single point in the corresponding color. The averaged BOLD signal β in regular stations is represented by the horizontal dashed line. (B) Voxels responding to the main effect of station type (exchange > regular) in the PMC (left) and dmPFC (right). (C) Voxels in the amygdala responding to the interaction between station type and response. (D) Voxels in the parietal cortex responding to the main effect of response switch. The coordinates in MNI space are provided under each slice. Significant regions within a circle survived correction for multiple comparisons.
Figure 4
Encoding of Context in Multivariate BOLD Signals (A) A depiction of the predicted representational dissimilarity matrix that was used to identify brain regions where pattern similarity was greater within than between contexts. Blue squares represent low dissimilarity and yellow squares high dissimilarity, for independent pairs of scanner runs and lines (x and y axes). (B) The results of the RSA identifying voxels encoding context, i.e., where multivoxel pattern dissimilarity was greater between than within contexts (lines), identified using a searchlight approach. (C) Voxels where the pattern encoding the parametric distance to goal (in units of stations) was more different between than within contexts (lines). (D) The results of the control analysis for (B) involving shuffled station-line assignments. An additional control analysis was performed to confirm that the effect was not driven by line orientation (see Figure S2). Significant regions within a circle survived correction for multiple comparisons.
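The logic of panels (A) and (B) can be sketched as a comparison between a predicted and an observed dissimilarity matrix. This is a simplified illustration, not the authors' pipeline: it uses correlation distance on a handful of made-up patterns, has no searchlight and no cross-run pairing, and every name and value in it is an assumption.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation between two equal-length pattern vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den


def model_rdm(labels):
    """Predicted dissimilarity: 0 within a context (line), 1 between contexts."""
    return [[0 if a == b else 1 for b in labels] for a in labels]


def observed_rdm(patterns):
    """Correlation distance (1 - r) between every pair of patterns."""
    return [[1 - pearson(p, q) for q in patterns] for p in patterns]


def context_effect(patterns, labels):
    """Mean between-context minus mean within-context dissimilarity."""
    model, obs = model_rdm(labels), observed_rdm(patterns)
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        (between if model[i][j] else within).append(obs[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


if __name__ == "__main__":
    # Made-up voxel patterns: two estimates per line.
    labels = ["red", "red", "blue", "blue"]
    patterns = [[1, 2, 3, 4], [1, 2, 3, 5], [4, 3, 2, 1], [5, 3, 2, 1]]
    print(context_effect(patterns, labels))
```

A positive value means patterns differ more between lines than within a line, which is the signature the caption describes for voxels encoding context.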

