Short-term memory traces for action bias in human reinforcement learning

Rafal Bogacz¹, Samuel M McClure, Jian Li, Jonathan D Cohen, P Read Montague

PMID: 17459346
DOI: 10.1016/j.brainres.2007.03.057

Rafal Bogacz et al. Brain Res. 2007 Jun 11;1153:111-21.

doi: 10.1016/j.brainres.2007.03.057. Epub 2007 Mar 24.

Affiliation

¹ Center for the Study of Brain, Mind and Behavior, Princeton University, Princeton, NJ 08544, USA. R.Bogacz@bristol.ac.uk

Abstract

Recent experimental and theoretical work on reinforcement learning has shed light on the neural bases of learning from rewards and punishments. One fundamental problem in reinforcement learning is the credit assignment problem, or how to properly assign credit to actions that lead to reward or punishment following a delay. Temporal difference learning solves this problem, but its efficiency can be significantly improved by the addition of eligibility traces (ET). In essence, ETs function as decaying memories of previous choices that are used to scale synaptic weight changes. It has been shown in theoretical studies that ETs spanning a number of actions may improve the performance of reinforcement learning. However, it remains an open question whether including ETs that persist over sequences of actions allows reinforcement learning models to better fit empirical data regarding the behaviors of humans and other animals. Here, we report an experiment in which human subjects performed a sequential economic decision game in which the long-term optimal strategy differed from the strategy that leads to the greatest short-term return. We demonstrate that human subjects' performance in the task is significantly affected by the time between choices in a surprising and seemingly counterintuitive way. However, this behavior is naturally explained by a temporal difference learning model which includes ETs persisting across actions. Furthermore, we review recent findings that suggest that short-term synaptic plasticity in dopamine neurons may provide a realistic biophysical mechanism for producing ETs that persist on a timescale consistent with behavioral observations.
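
The idea of an eligibility trace as a decaying memory of previous choices that scales weight changes can be illustrated with a minimal sketch. This is not the model fitted in the paper: the function name `update_with_traces`, the parameters `alpha` (learning rate) and `decay` (trace decay per choice), and the stateless two-choice setting are all assumptions made for illustration.

```python
def update_with_traces(weights, traces, action, reward, alpha=0.1, decay=0.5):
    """One learning step in a hypothetical stateless choice task.

    weights: current value estimate for each action
    traces:  decaying memory of how recently each action was chosen
    """
    # Decay every trace, then mark the action that was just taken.
    traces = [decay * e for e in traces]
    traces[action] += 1.0

    # Prediction error for the chosen action (no successor state in this toy case).
    delta = reward - weights[action]

    # Every weight changes in proportion to its own trace, so recently
    # chosen actions share credit for the current outcome.
    weights = [w + alpha * delta * e for w, e in zip(weights, traces)]
    return weights, traces


weights, traces = [0.0, 0.0], [0.0, 0.0]
weights, traces = update_with_traces(weights, traces, action=0, reward=1.0)
weights, traces = update_with_traces(weights, traces, action=1, reward=1.0)
print(weights, traces)
```

In this sketch the second reward also raises the value of action 0, because its trace has decayed to 0.5 rather than to zero. With `decay=0` the trace no longer persists across actions and only the chosen action is updated.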
