Front Neurosci. 2021 Oct 1:15:704728.
doi: 10.3389/fnins.2021.704728. eCollection 2021.

Additively Combining Utilities and Beliefs: Research Gaps and Algorithmic Developments

Anush Ghambaryan et al. Front Neurosci. 2021.

Abstract

Value-based decision making in complex environments, such as those with uncertain and volatile mapping of reward probabilities onto options, may engender computational strategies that are not necessarily optimal in terms of normative frameworks but may ensure effective learning and behavioral flexibility under limited neural computational resources. In this article, we review one such suboptimal strategy: additively combining the reward magnitude and reward probability attributes of options for value-based decision making. In addition, we present the computational intricacies of a recently developed model (the MIX model), an algorithmic implementation of the additive strategy in sequential decision making with two options. We also discuss the opportunities it offers, as well as its conceptual, inferential, and generalization issues. Finally, we suggest future studies that would reveal the potential of the MIX model and support its further development as a general model of value-based choice.
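
The contrast between the normative and the additive strategy can be illustrated with a minimal sketch. This is not the MIX model itself: the weight `omega`, the normalization by a maximum reward of 10, the softmax inverse temperature `beta`, and all function names are assumptions made for illustration only.

```python
import math

def multiplicative_value(prob, magnitude):
    """Normative expected value: reward probability times reward magnitude."""
    return prob * magnitude

def additive_value(prob, magnitude, omega=0.7, max_magnitude=10.0):
    """Hypothetical additive rule: weighted sum of reward probability (belief)
    and normalized magnitude. omega and max_magnitude are illustrative
    assumptions, not fitted parameters of any published model."""
    return omega * prob + (1.0 - omega) * (magnitude / max_magnitude)

def choice_probability(value_a, value_b, beta=5.0):
    """Logistic (two-option softmax) probability of choosing option A over B."""
    return 1.0 / (1.0 + math.exp(-beta * (value_a - value_b)))

# Option A: likely but small reward. Option B: unlikely but large reward.
option_a = (0.8, 2.0)
option_b = (0.2, 10.0)

ev_a, ev_b = multiplicative_value(*option_a), multiplicative_value(*option_b)
add_a, add_b = additive_value(*option_a), additive_value(*option_b)

print("expected value prefers:", "A" if ev_a > ev_b else "B")
print("additive rule prefers: ", "A" if add_a > add_b else "B")
print("P(choose A | additive):", round(choice_probability(add_a, add_b), 3))
```

With these illustrative numbers the two rules disagree: the multiplicative rule favors the large but unlikely reward, while the additive rule, weighted toward probability, favors the likely one. Other choices of `omega` can make the two rules agree.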

Keywords: MIX model; additive strategy; normalized utility; one-armed bandit task; state belief; uncertain and volatile environment; value-based decision making.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Probability of choosing an option according to the classical economics view (leftmost panel), the behavioral economics view (middle panel), and the recently developed MIX model (rightmost panel).
FIGURE 2
(A) Trial structure. In each trial, subjects see two shapes (options), a diamond and a square, each proposing a reward in euros randomly chosen from the set {2, 4, 6, 8, 10}. After making a choice, subjects see only the chosen option on the screen, followed by the outcome of the choice displayed in the center of the screen. The average duration of a trial was 4.15 s. Once the two available options appeared on the screen, subjects had 1.5 s to think and respond by pressing one of two instructed keyboard buttons: the left button to choose the option on the left side of the screen and the right button to choose the option on the right side. The outcome of the trial was displayed for 1.0 s. The delay before the outcome display was 0.1–0.2 s. The inter-trial delay was 0.4–0.6 s. (B) Experimental design. The outcome was either zero or equal to the proposed reward (shown on the first screen of each trial), with a probability that subjects were not informed about but could infer from experience. By design, reward frequencies of 20% and 80% were assigned to the two options and switched between them after a random number of trials (16, 20, 24, or 28). Subjects were not informed about the switches but could detect them during the experiment from feedback (outcomes). Each subject went through 19 switches of reward frequencies, which divided the task into 20 episodes (series of trials within which the reward frequencies do not change).
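
The design in panel (B) can be sketched as a small simulation. This is a hypothetical reconstruction from the caption, not the authors' task code: uniform and independent sampling of proposed rewards and episode lengths, the random choice of the initially better option, and the names `generate_task`, `REWARD_SET`, and `EPISODE_LENGTHS` are all assumptions.

```python
import random

REWARD_SET = [2, 4, 6, 8, 10]        # proposed rewards in euros (from the caption)
EPISODE_LENGTHS = [16, 20, 24, 28]   # possible numbers of trials between switches
N_EPISODES = 20                      # 19 switches divide the task into 20 episodes

def generate_task(seed=0):
    """Return a list of trials with hidden reward frequencies that switch
    between the two options at the start of each new episode."""
    rng = random.Random(seed)
    trials = []
    # Assumption: which option starts with the 80% frequency is random.
    high_option = rng.choice([0, 1])
    for episode in range(N_EPISODES):
        # Assumption: episode length is drawn uniformly from the four values.
        for _ in range(rng.choice(EPISODE_LENGTHS)):
            probs = [0.2, 0.2]
            probs[high_option] = 0.8
            # Assumption: proposed rewards are drawn independently per option.
            proposed = [rng.choice(REWARD_SET) for _ in range(2)]
            # Outcome is the proposed reward with probability p, else zero.
            outcomes = [m if rng.random() < p else 0
                        for m, p in zip(proposed, probs)]
            trials.append({"episode": episode, "probs": probs,
                           "proposed": proposed, "outcomes": outcomes})
        high_option = 1 - high_option  # switch reward frequencies
    return trials

trials = generate_task()
print(len(trials), "trials in", trials[-1]["episode"] + 1, "episodes")
```

Only the outcome of the chosen option would be shown to a subject; the sketch stores both so that any choice policy can be evaluated on the same schedule.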
FIGURE 3
Scheme of the computational algorithm of the MIX model.
FIGURE 4
Research directions for loss aversion, an exemplar of behavioral variation relative to choices made when outcomes are presented in the gain domain. Directions (a) and (b) are shown in orange and blue, respectively.
