Nat Commun. 2017 Jun 27;8:15958.

doi: 10.1038/ncomms15958.

Reminders of past choices bias decisions for reward in humans

Aaron M Bornstein¹, Mel W Khaw², Daphna Shohamy³,⁴, Nathaniel D Daw¹,⁵

Affiliations

¹ Neuroscience Institute, Princeton University, Washington Road, Princeton, New Jersey 08544, USA.
² Department of Economics, Columbia University, New York, New York 10027, USA.
³ Department of Psychology and Zuckerman Mind, Brain, Behavior Institute, New York, New York 10027, USA.
⁴ Kavli Center for Brain Science, Columbia University, New York, New York 10027, USA.
⁵ Department of Psychology, Princeton University, Princeton, New Jersey 08544, USA.

PMID: 28653668
PMCID: PMC5490260
DOI: 10.1038/ncomms15958

Aaron M Bornstein et al. Nat Commun. 2017.

Abstract

We provide evidence that decisions are made by consulting memories for individual past experiences, and that this process can be biased in favour of past choices using incidental reminders. First, in a standard rewarded choice task, we show that a model that estimates value at decision-time using individual samples of past outcomes fits choices and decision-related neural activity better than a canonical incremental learning model. In a second experiment, we bias this sampling process by incidentally reminding participants of individual past decisions. The next decision after a reminder shows a strong influence of the action taken and value received on the reminded trial. These results provide new empirical support for a decision architecture that relies on samples of individual past choice episodes rather than incrementally averaged rewards in evaluating options and has suggestive implications for the underlying cognitive and neural mechanisms.
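The contrast drawn in the abstract, between an incrementally averaged value and a value estimated from individual remembered outcomes, can be illustrated with a minimal sketch. This is not the authors' fitted model: the function names, the learning rate `alpha`, and the recency weighting used to pick which past trial is sampled are all assumptions made for illustration.

```python
import random


def incremental_value(rewards, alpha=0.3, initial=0.0):
    """Delta-rule running average (illustrative stand-in for an incremental learner).

    Each outcome moves the estimate by alpha times the prediction error.
    """
    value = initial
    for r in rewards:
        value += alpha * (r - value)
    return value


def sampled_value(rewards, alpha=0.3, n_samples=1, rng=None):
    """Estimate value at decision time from individual past outcomes.

    Assumption for this sketch: the outcome from i trials back is drawn with
    weight alpha * (1 - alpha)**i, so recent trials are sampled more often.
    """
    rng = rng or random.Random()
    if not rewards:
        return 0.0
    n = len(rewards)
    # rewards[t] happened (n - 1 - t) trials ago.
    weights = [alpha * (1 - alpha) ** (n - 1 - t) for t in range(n)]
    # random.choices normalises the weights before drawing.
    draws = rng.choices(rewards, weights=weights, k=n_samples)
    return sum(draws) / n_samples


if __name__ == "__main__":
    history = [5, -5, 5, 5, -5]  # hypothetical ticket values
    print("incremental:", incremental_value(history))
    print("one sample: ", sampled_value(history, rng=random.Random(1)))
```

With a single sample, the estimate is always one of the outcomes actually experienced rather than a blended average. Under this scheme, a reminder of a past trial could be modelled as raising that trial's sampling weight, which is one way to read the bias reported in the second experiment.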

Conflict of interest statement

The authors declare no competing financial interests.

Figures

**Figure 1. Restless bandit task and re-analysis.**
(a) Four-armed bandit from Daw *et al*. (2006). Participants chose between four slot machines to receive points. (b) Payoffs. The mean amount of points paid out by each machine varied slowly over the course of the experiment. (c) Model comparison. Log Bayes factors favouring sampling over the TD model.

**Figure 2. Ticket bandit task.**
(a) The ticket-bandit task. Each slot machine (‘bandit’) delivered tickets (trial-unique photographs), each associated with a dollar value of either −$5 or $5. (b) Payoff probabilities. The probability of each bandit paying out a winning ticket varied slowly over the course of the experiment. Participants were told that their total payout would depend both on the number of winning tickets they accrued and on their ability to respond correctly on a post-task memory test asking them to recall the reward value and slot machine associated with each ticket. (c) Memory probes. Participants encountered 32 recognition memory probes. On 26 of these probe trials, participants were shown objects that had been received on a previous choice trial (‘valid’), whereas on the others they were shown new objects that were not part of any previous trial (‘invalid’). Participants were asked only to perform a simple old/new recognition judgement: to press ‘yes’ if they had seen the image previously in this task and ‘no’ if they had not. After each recognition probe, the sequence of slot machine choices continued as before.

**Figure 3. Ticket bandit results.**
(a) Model comparison. Log Bayes factors favouring sampling over the TD model. (b) Impact of probes. As in standard RL models, choices are affected by previously observed rewards (black points). Here, memory probes evoking past decisions (red) also modulate choices on the subsequent choice trial. Data points are log odds of choosing the right-hand option. (*P<0.05, **P<0.01 and ***P<0.001).
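The two quantities named in this caption have standard definitions, sketched below for readers unfamiliar with them. The function names and the numbers in the usage lines are hypothetical and are not taken from the paper's data.

```python
import math


def log_odds(p):
    """Log odds (logit) of a choice probability p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))


def log_bayes_factor(log_evidence_a, log_evidence_b):
    """Log Bayes factor favouring model A over model B.

    Positive values favour A; inputs are log model evidences.
    """
    return log_evidence_a - log_evidence_b


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(log_odds(0.75))                    # probability of a right-hand choice
    print(log_bayes_factor(-100.0, -110.0))  # sampling vs TD log evidence
```

A log odds of zero corresponds to indifference between the two options, and a positive log Bayes factor in panel (a) would indicate that the sampling model is the better-supported account for that participant.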

Cited by

Global reward state affects learning and activity in raphe nucleus and anterior insula in monkeys.
Wittmann MK, Fouragnan E, Folloni D, Klein-Flügge MC, Chau BKH, Khamassi M, Rushworth MFS. Nat Commun. 2020 Jul 28;11(1):3771. doi: 10.1038/s41467-020-17343-w. PMID: 32724052. Free PMC article.
Non-action Learning: Saving Action-Associated Cost Serves as a Covert Reward.
Tanimoto S, Kondo M, Morita K, Yoshida E, Matsuzaki M. Front Behav Neurosci. 2020 Sep 4;14:141. doi: 10.3389/fnbeh.2020.00141. eCollection 2020. PMID: 33100979. Free PMC article.
Goal-Dependent Hippocampal Representations Facilitate Self-Control.
Edelson MG, Hare TA. J Neurosci. 2023 Nov 15;43(46):7822-7830. doi: 10.1523/JNEUROSCI.0951-22.2023. Epub 2023 Sep 15. PMID: 37714706. Free PMC article.
Computational mechanisms underlying latent value updating of unchosen actions.
Ben-Artzi I, Kessler Y, Nicenboim B, Shahar N. Sci Adv. 2023 Oct 20;9(42):eadi2704. doi: 10.1126/sciadv.adi2704. Epub 2023 Oct 20. PMID: 37862419. Free PMC article.
Memory precision and age differentially predict the use of decision-making strategies across the lifespan.
Noh SM, Singla UK, Bennett IJ, Bornstein AM. Sci Rep. 2023 Oct 9;13(1):17014. doi: 10.1038/s41598-023-44107-5. PMID: 37813942. Free PMC article.

References

1. Barto A. C. in Models of Information Processing in the Basal Ganglia (eds Houk, J. C., Davis, J. L. & Beiser, D. G.) 215–232 (MIT Press, 1995).
2. Schultz W., Montague P. R. & Dayan P. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
3. Sugrue L. P., Corrado G. S. & Newsome W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).
4. Daw N. D., O’Doherty J. P., Dayan P., Seymour B. & Dolan R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
5. Behrens T. E. J., Woolrich M. W., Walton M. E. & Rushworth M. F. S. Learning the value of information in an uncertain world. Nat. Neurosci. 10, 1214–1221 (2007).
