Nat Commun. 2017 Jun 27:8:15958.
doi: 10.1038/ncomms15958.

Reminders of past choices bias decisions for reward in humans

Aaron M Bornstein et al. Nat Commun.

Abstract

We provide evidence that decisions are made by consulting memories for individual past experiences, and that this process can be biased in favour of past choices using incidental reminders. First, in a standard rewarded choice task, we show that a model that estimates value at decision-time using individual samples of past outcomes fits choices and decision-related neural activity better than a canonical incremental learning model. In a second experiment, we bias this sampling process by incidentally reminding participants of individual past decisions. The next decision after a reminder shows a strong influence of the action taken and value received on the reminded trial. These results provide new empirical support for a decision architecture that relies on samples of individual past choice episodes rather than incrementally averaged rewards in evaluating options and has suggestive implications for the underlying cognitive and neural mechanisms.
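The contrast between the two model classes can be illustrated with a minimal sketch. It is not the authors' implementation. The delta-rule update, the learning rate `alpha`, and the recency weighting `alpha * (1 - alpha) ** age` are assumptions chosen for illustration, and all function names are hypothetical.

```python
import random


def incremental_value(rewards, alpha=0.3):
    """Delta-rule running average: Q <- Q + alpha * (r - Q), starting at 0."""
    q = 0.0
    for r in rewards:
        q += alpha * (r - q)
    return q


def recency_weights(n, alpha=0.3):
    """Assumed sampling weight alpha * (1 - alpha) ** age, oldest trial first."""
    return [alpha * (1 - alpha) ** (n - 1 - i) for i in range(n)]


def sampled_value(rewards, alpha=0.3, rng=random):
    """Estimate value from ONE remembered outcome, favouring recent trials.

    The leftover probability mass is treated as retrieving nothing (value 0).
    """
    weights = recency_weights(len(rewards), alpha)
    leftover = 1.0 - sum(weights)
    outcomes = list(rewards) + [0.0]
    return rng.choices(outcomes, weights=weights + [leftover], k=1)[0]


def expected_sampled_value(rewards, alpha=0.3):
    """Average of sampled_value over many hypothetical draws."""
    weights = recency_weights(len(rewards), alpha)
    return sum(w * r for w, r in zip(weights, rewards))
```

Under these assumptions the two estimates agree on average, but a single sample is one specific past outcome rather than a blend. That is why reminding a participant of one particular trial could plausibly shift the next choice under a sampling account.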

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Restless bandit task and re-analysis.
(a) Four-armed bandit from Daw et al. (2006). Participants chose between four slot machines to receive points. (b) Payoffs. The mean amount of points paid out by each machine varied slowly over the course of the experiment. (c) Model comparison. Log Bayes factors favouring sampling over the TD model.
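For readers unfamiliar with the measure in panel (c), a log Bayes factor is conventionally the difference in log model evidence between two models. The notation below is illustrative, with D standing for the observed choices; how the evidence was approximated in this study is not stated here.

```latex
\log \mathrm{BF} = \log p(D \mid M_{\text{sampling}}) - \log p(D \mid M_{\text{TD}})
```

Under this sign convention, positive values favour the sampling model and negative values favour the TD model.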
Figure 2. Ticket bandit task.
(a) The ticket-bandit task. Each slot machine (‘bandit’) delivered tickets (trial-unique photographs), each associated with a dollar value of either −$5 or $5. (b) Payoff probabilities. The probability of each bandit paying out a winning ticket varied slowly over the course of the experiment. Participants were told that their total payout would be contingent both on the number of winning tickets they accrued and on their ability to respond correctly on a post-task memory test asking them to recall the reward value and slot machine associated with each ticket. (c) Memory probes. Participants encountered 32 recognition memory probes. On 26 of these probe trials, participants were shown objects that they had received on a previous choice trial (‘valid’), whereas on the remaining probe trials they were shown new objects that were not part of any previous trial (‘invalid’). Participants were asked only to perform a simple old/new recognition judgement: to press ‘yes’ if they had seen the image previously in this task and ‘no’ if they had not. After each recognition probe, the sequence of slot machine choices continued as before.
Figure 3. Ticket bandit results.
(a) Model comparison. Log Bayes factors favouring sampling over the TD model. (b) Impact of probes. As in standard RL models, choices are affected by previously observed rewards (black points). Here, memory probes evoking past decisions (red) also modulate choices on the subsequent choice trial. Data points are log odds of choosing the right-hand option (*P<0.05, **P<0.01 and ***P<0.001).
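The log odds scale used for the data points in panel (b) can be read with a short helper. This is a generic sketch of the standard logit transform, not code from the study, and the function names are hypothetical.

```python
import math


def log_odds(p):
    """Log odds (logit) of a choice probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def choice_probability(logit):
    """Inverse of log_odds: map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))
```

A value of 0 corresponds to choosing the right-hand option half the time. Positive values indicate a preference for the right-hand option and negative values a preference for the left-hand option.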

References

    1. Barto A. G. in Models of Information Processing in the Basal Ganglia (eds Houk, J. C., Davis, J. L. & Beiser, D. G.) 215–232 (MIT Press, 1995).
    2. Schultz W., Montague P. R. & Dayan P. A neural substrate of prediction and reward. Science 275, 1593–1599 (1997).
    3. Sugrue L. P., Corrado G. S. & Newsome W. T. Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787 (2004).
    4. Daw N. D., O’Doherty J. P., Dayan P., Seymour B. & Dolan R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
    5. Behrens T. E. J., Woolrich M. W., Walton M. E. & Rushworth M. F. S. Learning the value of information in an uncertain world. Nat. Neurosci. 10, 1214–1221 (2007).