Neuron. 2018 Jul 11;99(1):179-193.e7.
doi: 10.1016/j.neuron.2018.06.008. Epub 2018 Jun 28.

A Dedicated Population for Reward Coding in the Hippocampus

Jeffrey L Gauthier et al. Neuron.

Abstract

The hippocampus plays a critical role in goal-directed navigation. Across different environments, however, hippocampal maps are randomized, making it unclear how goal locations could be encoded consistently. To address this question, we developed a virtual reality task with shifting reward contingencies to distinguish place versus reward encoding. In mice performing the task, large-scale recordings in CA1 and subiculum revealed a small, specialized cell population that was only active near reward yet whose activity could not be explained by sensory cues or stereotyped reward anticipation behavior. Across different virtual environments, most cells remapped randomly, but reward encoding consistently arose from a single pool of cells, suggesting that they formed a dedicated channel for reward. These observations represent a significant departure from the current understanding of CA1 as a relatively homogeneous ensemble without fixed coding properties and provide a new candidate for the cellular basis of goal memory in the hippocampus.

Keywords: CA1; hippocampus; navigation; place cells; place fields; reward; subiculum; virtual reality.

Conflict of interest statement

Declaration of interests

The authors declare no competing interests.

Figures

Figure 1: A distinct population of hippocampal neurons is consistently active near reward.
(A) Typical fields of view in CA1 and subiculum of neurons expressing GCaMP3. Image widths are 200 µm. (B) Schematic of the virtual linear track and reward delivery location. (C) COM locations of all cells with a spatial field during condition Aend (9,761 cells, 11 mice). Black line shows observed density, gray patches show density of a fitted mixture distribution consisting of a uniform distribution (light gray) and a Gaussian distribution (dark gray, mean 355 cm, s.d. 25 cm). (D) Schematic of condition in which reward delivery shifted between two locations. (E) Activity of six simultaneously-recorded CA1 neurons during the first three blocks of one session of condition AendAmid. Each column shows the spatially-averaged activity of one cell in the first (top), second (middle), and third (bottom) blocks. Activity on each traversal was spatially binned (width 10 cm), filtered (Gaussian kernel, radius 10 cm) and averaged (70th percentile) across all traversals, excepting the first three traversals of each block. Black arrowheads indicate COM location computed by pooling trials from all blocks of a single context (Aend or Amid, see Methods). Red lines indicate reward location in each block. (F) Top: track diagram. Bottom: COM locations during Amid of cells with a spatially-modulated field located within 25 cm of reward during Aend (square bracket beneath track diagram, 1,171 cells, 6 mice). Red lines indicate reward location, colored bands indicate clusters of reward-associated cells (purple) or cells whose field remained in the same location (blue). Similar results were obtained when considering CA1 and subiculum separately (Figure S1C). (G) The COM locations of all cells with spatial fields during both Aend and Amid (3,842 cells, 6 mice). Red lines indicate reward location. Arrows indicate regions defining reward-associated cells (purple) and place cells with stable field locations (blue). Colored markers indicate the COM locations of the examples in panel (E). Similar results were obtained when considering CA1 and subiculum separately (Figure S1D).
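The per-cell averaging described for panel (E) of Figure 1 (10 cm spatial bins, Gaussian filtering, 70th percentile across traversals, then a COM location) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names, the 400 cm track length, the reading of "radius 10 cm" as a Gaussian s.d. of one bin, the wrap-around smoothing, and the activity-weighted mean used for the COM are all hypothetical choices; the actual definitions are given in the paper's Methods.

```python
import math

BIN_WIDTH_CM = 10.0      # bin width stated in the caption
TRACK_LENGTH_CM = 400.0  # ASSUMPTION: track length is not given in the caption


def bin_traversal(positions, activity,
                  track_length=TRACK_LENGTH_CM, bin_width=BIN_WIDTH_CM):
    """Mean activity per spatial bin for one traversal (empty bins give 0.0)."""
    n_bins = int(round(track_length / bin_width))
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pos, act in zip(positions, activity):
        idx = min(max(int(pos // bin_width), 0), n_bins - 1)
        sums[idx] += act
        counts[idx] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def gaussian_smooth(values, sigma_bins=1.0):
    """Smooth with a normalized Gaussian kernel.

    ASSUMPTION: "radius 10 cm" is treated as s.d. = one 10 cm bin, and the
    track is treated as circular, so the kernel wraps around the ends.
    """
    n = len(values)
    half = int(math.ceil(3 * sigma_bins))
    offsets = list(range(-half, half + 1))
    weights = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * values[(i + k) % n] for k, w in zip(offsets, weights))
            for i in range(n)]


def percentile(values, q):
    """q-th percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (q / 100.0) * (len(ordered) - 1)
    lo = int(math.floor(rank))
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def average_across_traversals(binned_traversals, q=70.0):
    """Per-bin 70th percentile across traversals (one list per traversal)."""
    return [percentile(column, q) for column in zip(*binned_traversals)]


def center_of_mass(profile, bin_width=BIN_WIDTH_CM):
    """ASSUMPTION: COM is the activity-weighted mean of bin centers."""
    total = sum(profile)
    if total == 0:
        return None
    return sum((i + 0.5) * bin_width * a for i, a in enumerate(profile)) / total
```

In use, each traversal would be passed through bin_traversal and gaussian_smooth, the resulting profiles combined with average_across_traversals, and center_of_mass applied to the combined profile. Using a percentile rather than a mean makes the average less sensitive to traversals on which the cell was silent.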
Figure 2: Reward-associated cell identity persists across contexts.
(A) Schematic of condition in which mice were teleported between two different virtual linear tracks. (B) Activity of six simultaneously-recorded CA1 neurons during the first three blocks of one session of condition AB. Same conventions and spatial-averaging procedure as in Figure 1E, except that all traversals were included. (C) COM locations on track B for two populations of cells. Upper histogram: cells with a spatial field on track A located between 25 cm after track start and 25 cm before reward (wide square bracket). Lower histogram: cells with a spatial field on track A located in the 25 cm preceding reward (narrow square bracket). (D) COM locations of all cells with a spatial field on both track A and track B (2,168 cells, 5 mice); red lines indicate reward locations. Gray line indicates proportionally equivalent locations on the two tracks. Colored markers indicate COM locations of the examples in panel (B). (E) Schematic of COM density under two hypotheses for how spatial fields remapped (see text). (F) Observed density of COMs on track A and track B (same data as in (D)), spatially-binned (width 12.5 cm) and smoothed (2D Gaussian kernel, radius 20 cm). Due to circularity of the track, increased density is present in all four corners. (G) Schematic summarizing the observed remapping. Among both place cells and reward-associated cells, cell identities were fixed, though spatial fields were formed by a different subset of cells in each environment. Place cells (blue) covered the track uniformly, while reward-associated fields (purple) were only located near reward.
Figure 3: Reward-predictive cell activity is correlated with anticipation of reward.
(A) Representative example of reward approach behaviors. (B) Movement speeds during one session with slowing threshold (dashed line). (C) Spatially-binned speed for single trials (gray lines) and averaged across trials (black lines). Red lines indicate reward. (D) Left: spatially-binned speed for the first three blocks of one session of condition AendAmid, first block is top panel, same conventions as in (B). The first three traversals of each block are omitted. Right: Running speed on the first fifty traversals. (E) Reward approach behavior on six trials from the session depicted in (C) comparing speed (gray), slowing onset (black), and activity of one reward-predictive cell in CA1 (purple). (F) Speed (gray) and activity (purple) on all traversals in which slowing onset (black lines) occurred within 60 cm before the reward location (red line) for same cell as in (E). Each pixel shows average in a 2 cm spatial bin. Black tick marks show example trials plotted in (E). (G) Activity of a simultaneously-recorded place cell, same conventions as in (E), except activity is shown in blue. (H) Statistical test to evaluate correlation between activity and speed for the cells depicted in (F) and (G). (I) Left: COM locations of cells with spatial fields in both Aend and Amid (gray points, same data as Figure 1G). Highlighted cells (maroon circles, 116 cells) were slowing-correlated during Aend (see definition in text). Right: Lower bound of estimated density of slowing-correlated cells, binned by COM location and spatially smoothed (see Methods). Dashed lines indicate approximate boundaries of reward-predictive cells (purple) and stable place cells (blue). (J) Upper: average activity of 198 slowing-correlated cells (6 mice, maroon trace) during Aend blocks, and all spatially-modulated cells recorded simultaneously (7,343 cells, gray trace). Red line indicates Aend reward location. Lower: activity of same cells during the interleaved Amid blocks. Red lines indicate reward location for Amid (solid) and Aend (dashed). Bands indicate standard error of the mean. For arrowheads, see text.
Figure 4: Place cells and reward-predictive cells were active simultaneously.
(A) For each of 6 sessions (columns) from 4 mice, 10 representative pre-reward walking bouts (rows) are shown. For each bout, colored traces show activity averaged across all reward-predictive cells (purple) or reward-adjacent place cells (blue). Bouts are sorted according to the fraction of total activity that arose from place cells (most to least). Activity was averaged in 0.3 second bins, and is shown beginning one second prior to the onset of slowing (black vertical line) until reward delivery (red vertical line), or at most 5 seconds. On many bouts, reward cells and place cells were active simultaneously. (B) Example illustrating how the activity of each bout is summarized in the population analysis of panel (C). For a single bout (left panel, same conventions as in (A)), activity is plotted as a scatter (right panel) comparing place cells (horizontal) to reward-predictive cells (vertical). (C) Two-dimensional histogram summarizing activity from all pre-reward walking bouts in one session. Place and reward-predictive cells were frequently active simultaneously, and their activity was significantly correlated. (D) Two-dimensional histogram summarizing activity from all pre-reward walking bouts during condition AendAmid, same conventions as in (C). To enhance readability, tails of the distribution (0.3% of time points) are not shown. The activity of place cells and reward-predictive cells was significantly correlated, indicating a tendency for the two populations to be active simultaneously. (E) Control to ensure that the positive correlation was not due to place cells and reward-predictive cells having a similar time course. When activity was shuffled across all sessions (upper panel), the distribution of correlations (black histogram) was lower than the observed value (black vertical line). This was also true when activity was shuffled only within each session (lower panel). (F) Control to ensure that the correlation was not due to the residual fluorescence time course following cessation of activity. For each cell, the original time course was binarized by zeroing all time points following the initial rise in each transient, and setting the amplitude of all non-zero points to 1 (upper panel, see Methods). After using these binarized time courses to perform the same analysis as in the bottom of panel (E), there was still a significant correlation between the activity of reward-predictive cells and place cells (lower panel). These results show that reward-adjacent place cells and reward-predictive cells were not anti-correlated, and in fact the two populations tended to be active simultaneously more often than expected by chance.
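The shuffle control described for panel (E) of Figure 4 can be illustrated with a small permutation test. This is a sketch under assumptions and not the authors' procedure: it assumes each bout is stored as two time series sampled in the same bins (mean place-cell activity and mean reward-predictive-cell activity), that "shuffling" means re-pairing place-cell traces with reward-predictive traces taken from other bouts, and that the statistic is the Pearson correlation over pooled time points. All function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pooled_correlation(place_bouts, reward_bouts):
    """Correlate place-cell and reward-cell activity over all paired time points.

    Bout i of place_bouts is paired with bout i of reward_bouts; if the two
    traces differ in length, the longer one is truncated.
    """
    xs, ys = [], []
    for place, reward in zip(place_bouts, reward_bouts):
        n = min(len(place), len(reward))
        xs.extend(place[:n])
        ys.extend(reward[:n])
    return pearson(xs, ys)


def shuffle_control(place_bouts, reward_bouts, n_shuffles=1000, seed=0):
    """Compare the observed correlation with a null built by re-pairing bouts.

    ASSUMPTION: the null distribution is generated by randomly permuting which
    reward-cell bout is paired with each place-cell bout.
    """
    rng = random.Random(seed)
    observed = pooled_correlation(place_bouts, reward_bouts)
    order = list(range(len(reward_bouts)))
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(order)
        shuffled = [reward_bouts[i] for i in order]
        null.append(pooled_correlation(place_bouts, shuffled))
    # One-sided p-value with the usual +1 correction for permutation tests.
    p_value = (1 + sum(r >= observed for r in null)) / (1 + n_shuffles)
    return observed, null, p_value
```

Restricting the permutation to bouts from the same session would give the within-session variant described for the lower panel. If the two populations merely shared a stereotyped time course, the shuffled pairings would reproduce the observed correlation; an observed value above the null distribution points to bout-by-bout co-activity instead.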
Figure 5: Reward-predictive cells formed a consistent sequence that began prior to reward anticipation behavior.
(A) Mean activity of 22 simultaneously-recorded reward-predictive cells from CA1 shown in the same order for Aend (left) and Amid (right). Small black lines indicate time of peak activity. Cells were selected for having COM locations within 50 cm before reward and being active on at least 20 trials in the 100 cm before reward. Time courses were filtered with a Gaussian kernel (s.d. 0.1 sec). (B) Time of peak activity relative to slowing for the same cells as in (A). Bars indicate width at half max of unfiltered trace. (C) Time of peak activity relative to slowing for all reward-associated cells recorded during condition AendAmid (218 cells, 6 mice) and condition AB (243 cells, 5 mice). Bars indicate width at half max of unfiltered trace. (D) Five reward-predictive cells active early in their respective sequences. In each case, fluorescence increased 1-2 seconds before speed decreased. Cells were recorded in four mice; each column shows one cell. Cells in columns 2 and 3 were recorded simultaneously. The cell in column 4 was from subiculum, and others were from CA1. Top: speed and activity on single traversals, same conventions as in Figure 3F. Activity on each trial was normalized to have a maximum of 1. Red lines indicate the time of reward delivery when it occurred early enough to be within plot bounds. Bottom: average across trials of activity (80th percentile) and speed (mean).
Figure 6: Reward-predictive cell activity cannot be explained by reward anticipation behavior.
(A-B) Activity is shown for the same 22 simultaneously-recorded reward-predictive cells depicted in Figure 5A. (A) Instantaneous movement speed (top), activity of each reward-predictive cell plotted in the same order as in Figure 5A (middle), and population mean activity (bottom) during two walking bouts, preceding the current (left) or non-current reward location (right). Black lines indicate onset of walking, red lines indicate reward, and gray lines indicate the end of the unrewarded walking bout. (B) Top: Movement speed averaged over all walking bouts from this session, excluding the first three traversals of each block, grouped by whether they preceded current (pink) or non-current (blue-green) reward. Bottom: Simultaneous activity of reward-predictive cells. Single trial traces were averaged in half-second chunks before combining across trials; bands show standard error of the mean across trials. (C) Average speed relative to slowing threshold (top panels) and average activity of reward-predictive cells (bottom). Only includes sessions in which reward-predictive cells were recorded. A subset of bouts was manually chosen (see Methods) to maximize the similarity of average speed; for all bouts, see Figure S6. Condition AendAmid only includes data from day 7 of training or later to ensure mice were familiar with the reward delivery paradigm. (D) Comparison of how quickly slowing behavior and reward-predictive cell activity adapt to a new context. Condition AB only includes data from track A. Top: fraction of traversals in which mice exhibited a pre-reward walking bout (pink) or an unrewarded walking bout spanning the non-current reward location (blue-green). Error bars indicate 95% confidence interval. Bottom: mean fluorescence of reward-predictive cells in the first 5 seconds after slowing onset; error bars show standard error of the mean.
