Modeling awake hippocampal reactivations with model-based bidirectional search

Mehdi Khamassi¹, Benoît Girard²

Affiliations

¹ Institute of Intelligent Systems and Robotics (ISIR), Sorbonne Université and CNRS (Centre National de la Recherche Scientifique), 75005, Paris, France. Mehdi.Khamassi@sorbonne-universite.fr.
² Institute of Intelligent Systems and Robotics (ISIR), Sorbonne Université and CNRS (Centre National de la Recherche Scientifique), 75005, Paris, France.

PMID: 32065253
DOI: 10.1007/s00422-020-00817-x

Biol Cybern. 2020 Apr;114(2):231-248. Epub 2020 Feb 17.

Abstract

Hippocampal offline reactivations during reward-based learning, usually categorized as replay events, have been found to be important for performance improvement over time and for memory consolidation. Recent computational work has linked these phenomena to the need to transform reward information into state-action values for decision making and to propagate it to all relevant states of the environment. Nevertheless, it is still unclear whether an integrated reinforcement learning mechanism could account for the variety of awake hippocampal reactivations, including variety in order (forward and reverse reactivated trajectories) and variety in the location where they occur (reward site or decision-point). Here, we present a model-based bidirectional search model which accounts for a variety of hippocampal reactivations. The model combines forward trajectory sampling from the current position with backward sampling, through prioritized sweeping, from states associated with large reward prediction errors, until the two trajectories connect. This is repeated until the state-action values stabilize (convergence), which could explain why hippocampal reactivations drastically diminish when the animal's performance stabilizes. Simulations in a multiple T-maze task show that forward reactivations are found predominantly at decision-points while backward reactivations are exclusively generated at reward sites. Finally, the model can generate imaginary trajectories that the agent is not permitted to take during task performance. We propose experimental predictions and discuss implications for future studies of the role of the hippocampo-prefronto-striatal network in learning.
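
The loop described in the abstract (forward sampling from the current position, backward prioritized sweeping from a surprising state, stop when the two trajectories meet, repeat until values stabilize) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the five-state single T-maze, the greedy forward policy, the fixed seed priority at the reward site, and all names (`TRANSITIONS`, `REWARD`, `GAMMA`, `THETA`, `bidirectional_sweep`, `plan`) are assumptions chosen only to keep the example small and deterministic.

```python
import heapq

# Hypothetical deterministic world model: state -> {action: next_state}.
# Corridor 0-1-2, then a T-junction with arm 3 (rewarded) and arm 4 (empty).
TRANSITIONS = {
    0: {"fwd": 1},
    1: {"fwd": 2},
    2: {"left": 3, "right": 4},
    3: {},  # terminal reward site
    4: {},  # terminal, no reward
}
REWARD = {3: 1.0}   # reward received on entering a state (assumed)
GAMMA = 0.9         # discount factor (assumed)
THETA = 1e-3        # minimum value change that re-queues predecessors (assumed)


def predecessors(state):
    """All (state, action) pairs whose modeled outcome is `state`."""
    return [(s, a) for s, acts in TRANSITIONS.items()
            for a, s2 in acts.items() if s2 == state]


def state_value(q, s):
    """Greedy value of a state; terminal states are worth 0."""
    return max((q[(s, a)] for a in TRANSITIONS[s]), default=0.0)


def backup(q, s, a):
    """One model-based value backup; returns the absolute change."""
    s2 = TRANSITIONS[s][a]
    target = REWARD.get(s2, 0.0) + GAMMA * state_value(q, s2)
    change = abs(target - q[(s, a)])
    q[(s, a)] = target
    return change


def bidirectional_sweep(q, current, reward_site):
    """Alternate forward and backward steps until the trajectories share a state."""
    forward, backward = [current], [reward_site]
    fwd_seen, bwd_seen = {current}, {reward_site}
    # Seed the backward queue with a fixed priority (an assumption standing in
    # for a large reward prediction error at the reward site).
    queue = [(-1.0, s, a) for s, a in predecessors(reward_site)]
    heapq.heapify(queue)
    s = current
    while not (fwd_seen & bwd_seen):
        progressed = False
        if TRANSITIONS[s]:  # forward step: follow the greedy action in the model
            a = max(TRANSITIONS[s], key=lambda act: q[(s, act)])
            backup(q, s, a)
            s = TRANSITIONS[s][a]
            forward.append(s)
            fwd_seen.add(s)
            progressed = True
        if queue and not (fwd_seen & bwd_seen):  # backward step: prioritized sweeping
            _, ps, pa = heapq.heappop(queue)
            change = backup(q, ps, pa)
            backward.append(ps)
            bwd_seen.add(ps)
            if change > THETA:
                for pps, ppa in predecessors(ps):
                    heapq.heappush(queue, (-change, pps, ppa))
            progressed = True
        if not progressed:  # neither direction can extend: give up on this sweep
            break
    return forward, backward


def plan(current=0, reward_site=3, tol=1e-6, max_sweeps=100):
    """Repeat bidirectional sweeps until no state-action value changes by more than tol."""
    q = {(s, a): 0.0 for s, acts in TRANSITIONS.items() for a in acts}
    sweeps = 0
    for sweeps in range(1, max_sweeps + 1):
        before = dict(q)
        bidirectional_sweep(q, current, reward_site)
        if max(abs(q[k] - before[k]) for k in q) < tol:
            break
    return q, sweeps
```

Calling `bidirectional_sweep` on freshly zeroed values with `current=0` and `reward_site=3` yields a forward trajectory that starts at the agent's position and a backward trajectory that starts at the reward site, with both ending at the junction where they connect. In `plan`, the sweeps stop once a full sweep changes no value, which mirrors the abstract's suggestion that reactivations diminish as values stabilize.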

Keywords: Computational neuroscience; Hippocampal replay; Navigation; Reinforcement learning.
