Distributional Reinforcement Learning in the Brain
- PMID: 33092893
- PMCID: PMC8073212
- DOI: 10.1016/j.tins.2020.09.004
Abstract
Learning about rewards and punishments is critical for survival. Classical studies have demonstrated an impressive correspondence between the firing of dopamine neurons in the mammalian midbrain and the reward prediction errors of reinforcement learning algorithms, which express the difference between actual reward and predicted mean reward. However, it may be advantageous to learn not only the mean but also the complete distribution of potential rewards. Recent advances in machine learning have revealed a biologically plausible set of algorithms for reconstructing this reward distribution from experience. Here, we review the mathematical foundations of these algorithms as well as initial evidence for their neurobiological implementation. We conclude by highlighting outstanding questions regarding the circuit computation and behavioral readout of these distributional codes.
Keywords: artificial intelligence; deep neural networks; dopamine; machine learning; population coding; reward.
Copyright © 2020 The Author(s). Published by Elsevier Ltd. All rights reserved.
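The contrast the abstract draws, between a prediction error computed against the predicted mean reward and a code that captures the whole reward distribution, can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than something stated in the abstract: the function names, the learning rate, the two-outcome reward schedule, and in particular the asymmetric scaling of positive and negative errors, which is one scheme discussed in the distributional reinforcement learning literature.

```python
import random


def learn_mean(rewards, lr=0.01):
    """Classical update: a single value estimate driven by the prediction error."""
    value = 0.0
    for r in rewards:
        delta = r - value          # actual reward minus predicted mean reward
        value += lr * delta
    return value


def learn_distribution(rewards, taus, lr=0.01):
    """Hypothetical population of units, each with its own asymmetry tau.

    A unit with tau near 1 weights positive errors more heavily (optimistic);
    a unit with tau near 0 weights negative errors more heavily (pessimistic).
    """
    values = [0.0 for _ in taus]
    for r in rewards:
        for i, tau in enumerate(taus):
            delta = r - values[i]
            scale = tau if delta > 0 else (1.0 - tau)  # assumed asymmetric scaling
            values[i] += lr * scale * delta
    return values


# Assumed toy reward schedule: reward is 1 or 10 with equal probability.
rng = random.Random(0)
rewards = [rng.choice([1.0, 10.0]) for _ in range(20000)]

mean_estimate = learn_mean(rewards)
taus = [0.1, 0.5, 0.9]
population = learn_distribution(rewards, taus)

print("mean estimate:", round(mean_estimate, 2))
for tau, v in zip(taus, population):
    print(f"tau={tau}: value estimate {v:.2f}")
```

Under this toy schedule the single estimate hovers around the mean of 5.5 and carries no trace of the two distinct outcomes. The population spreads out instead: for this update rule the fixed point of each unit is 1 + 9·tau, so the estimates fluctuate around roughly 1.9, 5.5, and 9.1, and together they reflect the spread of the reward distribution rather than only its centre.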