Animals (Basel). 2024 Jan 29;14(3):431.
doi: 10.3390/ani14030431.

The Hippocampus in Pigeons Contributes to the Model-Based Valuation and the Relationship between Temporal Context States

Lifang Yang et al. Animals (Basel).

Abstract

Model-based decision-making guides an organism's behavior through a representation of the relationships between different states. Previous studies have shown that the mammalian hippocampus (Hp) plays a key role in learning the structure of relationships among experiences. However, the hippocampal neural mechanisms of model-based learning in birds have rarely been reported. Here, we trained six pigeons to perform a two-step task and explored whether their Hp contributes to model-based learning. Behavioral performance and hippocampal multi-channel local field potentials (LFPs) were recorded during the task. We estimated the subjective values using a reinforcement learning model dynamically fitted to the pigeons' choice behavior. The results show that the model-based learner can capture the behavioral choices of pigeons well throughout the learning process. Neural analysis indicated that high-frequency (12–100 Hz) power in the Hp represented the temporal context states. Moreover, dynamic correlation and decoding results provided further support for the high-frequency dependence of model-based valuations. In addition, we observed a significant increase in hippocampal neural similarity in the low-frequency band (1–12 Hz) for common temporal context states after learning. Overall, our findings suggest that pigeons use model-based inferences to learn multi-step tasks, and that multiple LFP frequency bands collaboratively contribute to model-based learning. Specifically, the high-frequency (12–100 Hz) oscillations represent model-based valuations, while the low-frequency (1–12 Hz) neural similarity is influenced by the relationship between temporal context states. These results contribute to our understanding of the neural mechanisms underlying model-based learning and broaden the scope of hippocampal contributions to avian behavior.
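As a rough illustration of what a model-based valuation means in a two-step task, the sketch below computes first-step option values as the expectation of second-step state values under a transition model, then turns them into choice probabilities with a softmax. This is not the authors' fitted model: the option and state labels, the transition probabilities, the learning rate, and the inverse temperature `beta` are all hypothetical values chosen for illustration.

```python
import math


def model_based_q(transition, state_values):
    """First-step option values as the expected second-step state value."""
    return {
        option: sum(p * state_values[state] for state, p in probs.items())
        for option, probs in transition.items()
    }


def softmax_choice_prob(q_values, beta):
    """Choice probabilities under a softmax with inverse temperature beta."""
    q_max = max(q_values.values())  # subtract the max for numerical stability
    weights = {o: math.exp(beta * (q - q_max)) for o, q in q_values.items()}
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}


def update_state_value(value, reward, learning_rate):
    """Delta-rule update of a second-step state value after one outcome."""
    return value + learning_rate * (reward - value)


# Hypothetical transition model: option "A" usually leads to state "X".
transition = {
    "A": {"X": 0.8, "Y": 0.2},
    "B": {"X": 0.2, "Y": 0.8},
}
state_values = {"X": 0.5, "Y": 0.5}

# One rewarded visit to state "X" raises its value (assumed learning rate 0.2).
state_values["X"] = update_state_value(state_values["X"], reward=1.0, learning_rate=0.2)

q = model_based_q(transition, state_values)
probs = softmax_choice_prob(q, beta=3.0)
print(q, probs)
```

Because the option values are derived from the transition model, a single rewarded visit to one second-step state immediately shifts preference toward the option that most often leads there, which is the behavioral signature a model-based learner is meant to capture.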

Keywords: hippocampus; local field potentials; model-based valuation; pigeon; representation of relationships.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
The apparatus, two-step probabilistic learning task, recording sites, and LFP analysis. (A) Schematic diagram of the pigeon training apparatus. (B) The two-step task for pigeons is as follows: A 5 s gray screen illuminates to indicate that the trial is ready. Step 1 (transition structure) is initiated by the simultaneous presentation of two differently colored markers (S1+ and S1−) on both sides of the screen. The pigeons indicate their choice by pecking the key below the corresponding target option. A probabilistic transition takes place, with a probability that depends on the pigeons' choice. Following this, a blue triangle (S2+) or a circle (S2−) marker appears in the middle of the screen, indicating the outcome of the transition and initiating Step 2 (reward structure). The pigeons then peck the key below the S2 target within 2 s. A 3 s reward is delivered with an appropriate probability. (C) The histological verification of the implantation site. (D) The schematic of the acquired 16-electrode LFP signals, represented in different colors. (E) Time-frequency patterns (1–100 Hz) were extracted from all electrodes. For every temporal epoch during the S1 and S2 presentation, the z-scored power from every electrode l and every frequency band f was combined to create a single feature vector Zl,f for that epoch. Numbers 1–16 represent the electrode’s ID. (F) Distributed oscillatory power of S1 (top panel) or S2 (bottom panel). Two feature vectors S1i and S2j, composed of all vectors Zl,f, were constructed for each S1 presentation temporal epoch i and each S2 presentation temporal epoch j. (G) Neural similarity (cosine similarity rho) for all temporal context-dependent state pairs is shown for a single trial.
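The feature-vector and similarity construction described in panels (E–G) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the toy array sizes (two electrodes by two frequency bands instead of 16 electrodes over 1–100 Hz), and the handling of zero-norm vectors are assumptions.

```python
import math


def flatten_features(power):
    """Concatenate z-scored power[electrode][frequency] into one feature vector."""
    return [value for electrode in power for value in electrode]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat an all-zero epoch as dissimilar
    return dot / (norm_u * norm_v)


def similarity_map(s1_epochs, s2_epochs):
    """Similarity between every S1 epoch i (rows) and every S2 epoch j (columns)."""
    return [
        [cosine_similarity(flatten_features(s1), flatten_features(s2))
         for s2 in s2_epochs]
        for s1 in s1_epochs
    ]


# Toy data: each epoch is a 2-electrode x 2-frequency grid of z-scored power.
s1_epochs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]
s2_epochs = [
    [[1.0, 0.0], [0.0, 1.0]],
]

rho = similarity_map(s1_epochs, s2_epochs)
print(rho)
```

Each entry of the resulting map corresponds to one pair of an S1 epoch and an S2 epoch within a trial, so a trial yields a two-dimensional time-by-time similarity map.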
Figure 2
Pigeons’ behavioral performance, parameter changes in the fitted RL model, and option values and state values estimated by the dynamic RL model. (A) Reward rate of all pigeons during the whole learning process, where the reward rate of each session is calculated as the ratio of rewarded trials to the total number of trials (n = 6, session = 60, mean ± SD). (B) Dynamic correct choice (S1+) rate of all pigeons, with the value representing the ratio of S1+ choice trials to the total number of trials in each session (n = 6, session = 60, mean ± SD). (C) The changing trend of the parameters (γ, β) of the RL model fitted to all the pigeons’ behavior (mean ± SD). (D) The S1+ choice rate observed for one example pigeon (P090), estimated by the dynamic RL model, and estimated by the static RL model (the model was run for 50 rounds, mean ± SD). (E,F) Trends of the option value QS1 in Step 1 and the state value QS2 in Step 2 estimated by the RL model (n = 6, mean ± SD).
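The per-session rates in panels (A,B) are simple ratios, summarized across animals as mean ± SD. The sketch below shows one way to compute them; the trial data are invented, only three hypothetical animals are shown, and the use of the sample standard deviation is an assumption, since the caption does not say which SD estimator was used.

```python
from statistics import mean, stdev


def session_rate(flags):
    """Ratio of flagged trials (rewarded, or S1+ chosen) to all trials in a session."""
    if not flags:
        raise ValueError("a session needs at least one trial")
    return sum(flags) / len(flags)


def summarize_across_animals(rates):
    """Mean and sample SD of one session's rate across animals."""
    return mean(rates), stdev(rates)


# Hypothetical single session: 1 = rewarded trial, 0 = unrewarded trial.
session_by_animal = {
    "pigeon_1": [1, 0, 1, 1],
    "pigeon_2": [1, 1, 1, 1],
    "pigeon_3": [0, 0, 1, 1],
}

rates = [session_rate(trials) for trials in session_by_animal.values()]
avg, sd = summarize_across_animals(rates)
print(rates, avg, sd)
```

The same `session_rate` function applies to the S1+ choice rate if the flags mark S1+ choices instead of rewards.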
Figure 3
Hippocampal LFPs power distribution of temporal context-dependent states. Panels (A,B) display the normalized power distribution maps within the 12–100 Hz range for the common and uncommon temporal context-dependent states, respectively. Numbers 1–16 in the color map represent the electrode’s ID, and their arrangement corresponds to the relative position of the electrode implantation in the brain. The grayscale map corresponds to the significant differences of all electrodes under different conditions (n = 6, Early stage = 100 trials, Prob (S1+ choice) < 65%; Late stage = 100 trials, Prob (S1+ choice) > 90%). Panels (C,D) present the p-map for all different conditions in the low-frequency band (1–12 Hz), where “ns” indicates no significance. Panels (E,F) depict the dynamic changes in the normalized power in the hippocampus within the 12–100 Hz band, while panels (G,H) detail the dynamic changes within the 1–12 Hz band for the common temporal context states. The solid line in panels (E–H) indicates the average of the normalized power in each session (n = 6, session = 60) of every state. The data were smoothed with a five-point moving average. The normalized power of each trial represents the average within a 0.5 s time window before pecking the key in Step 1 or Step 2, and the result of each session is the average of the normalized power of all trials within that session.
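The five-point moving average mentioned in the caption can be sketched as below. The caption does not state how the first and last sessions were handled, so the shrinking window at the edges is an assumption, as are the function name and the example values.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the edges (assumption)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        segment = values[start:stop]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical per-session normalized power values.
session_power = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
smoothed_power = moving_average(session_power)
print(smoothed_power)
```

With a centered window, interior sessions are averaged with their two neighbors on each side, while the first and last two sessions use only the neighbors that exist.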
Figure 4
The normalized high-frequency power of hippocampal LFPs for options in Step 1 and states in Step 2 is dynamically correlated with model-based valuations. (A) The relationship between the normalized power of option S1+ and the option value QS1+ estimated by the model-based learner. (B) The relationship between the normalized power of state S2+ and the state value QS2+ estimated by the model-based learner. (C) The relationship between the normalized power of option S1− and the option value QS1−. (D) The relationship between the normalized power of state S2− and the state value QS2− estimated by the model-based learner. The dots’ color fading from yellow to blue represents the passage from early sessions to late sessions.
Figure 5
The neural similarity in the low-frequency band (1–12 Hz) in the Hp is influenced by common temporal context states. The grand average neural similarity map for common (Panel (A)) and uncommon (Panel (B)) trials in the Hp during the early and late learning stages. The time window is aligned to the moment of pecking the key in Step 1, corresponding to 0 s in the panel. Panel (C) represents the difference between Panels (A,B). Panel (D) displays the p-map of the common vs. uncommon contrast in the early learning stage, with no significant clusters identified. In Panel (E), the p-map of the common vs. uncommon contrast reveals a significant cluster at p(corr) < 0.05 (outlined in black and named tROI 1) in the late learning stage, indicating that the neural similarity in the Hp represents the binding of temporal context state information (the relationship between two temporal context states). Panel (F) shows the p-map of the early vs. late learning stage contrast, demonstrating a significant cluster at p(corr) < 0.05 (outlined in black and named tROI 2) in the common states, indicating that the neural similarity in the Hp only represents the binding of common temporal context states. Panel (G) shows the early vs. late learning stage contrast in the uncommon states, which does not display any significant clusters. Gray shadow regions during the S1 presentation (−0.9–0 s) and during the S2 presentation (0–0.1 s) are not considered across the map due to the potential presence of key-pecking artifacts. Panel (H) presents the neural similarity in both common and uncommon conditions in the late learning stage, in the tROI 1 of Panel (E). Panel (I) shows the neural similarity in both the early and late learning stages under the common condition in the tROI 2 of Panel (F). (n = 6; Early stage = 100 trials, Prob (S1+ choice) < 65%; Late stage = 100 trials, Prob (S1+ choice) > 90%; two sample test; *** indicates p < 0.001; mean ± sem).
Figure 6
The quantification of neural similarity at the high-frequency band (12–100 Hz) for all conditions. Subfigures (A–I) correspond to the panels of Figure 5 and show a significant reduction in hippocampal neural similarity at the high-frequency (12–100 Hz) band compared to the low-frequency (1–12 Hz) band. There was no significant increase in neural similarity in the Hp for any of the conditions, including the early and late learning stages in tROI 1, or common and uncommon states in tROI 2. (n = 6; Early stage = 100 trials, Prob (S1+ choice) < 65%; Late stage = 100 trials, Prob (S1+ choice) > 90%; two sample test; “ns” indicates no significance).
