Curr Biol. 2018 Mar 5;28(5):795-802.e6.
doi: 10.1016/j.cub.2018.01.071. Epub 2018 Feb 15.

Dynamic Interplay of Value and Sensory Information in High-Speed Decision Making

Kivilcim Afacan-Seref et al. Curr Biol.

Abstract

In dynamic environments, split-second sensorimotor decisions must be prioritized according to potential payoffs to maximize overall rewards. The impact of relative value on deliberative perceptual judgments has been examined extensively [1-6], but relatively little is known about value-biasing mechanisms in the common situation where physical evidence is strong but the time to act is severely limited. In prominent decision models, a noisy but statistically stationary representation of sensory evidence is integrated over time to an action-triggering bound, and value biases are effected by starting the integrator closer to the more valuable bound. Here, we show significant departures from this account for humans making rapid sensory-instructed action choices. Behavior was best explained by a simple model in which the evidence representation (and hence the rate of accumulation) is itself biased by value and is non-stationary, increasing over the short decision time frame. Because the value bias initially dominates, the model uniquely predicts a dynamic "turn-around" effect on low-value cues, where the accumulator first launches toward the incorrect action but is then re-routed to the correct one. This was clearly exhibited in electrophysiological signals reflecting motor preparation and evidence accumulation. Finally, we construct an extended model that implements this dynamic effect through plausible sensory neural response modulations and demonstrate the correspondence between decision signal dynamics simulated from a behavioral fit of that model and the empirical decision signals. Our findings suggest that value and sensory information can exert simultaneous and dynamically countervailing influences on the trajectory of the accumulation-to-bound process, driving rapid, sensory-guided actions.
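The "turn-around" prediction can be illustrated with a minimal sketch. It assumes a drift rate that starts at a value-dependent offset and grows linearly with time; decision bounds and non-decision time are omitted. The function name `simulate_dv` and all parameter values are hypothetical choices for illustration and are not the fitted values from the study.

```python
import numpy as np

def simulate_dv(value_bias, evidence_slope, noise_sd=0.0,
                dt=0.001, duration=0.3, rng=None):
    """Accumulate a value-biased, linearly increasing drift rate.

    Positive DV favors the correct action. Bounds are omitted, so this
    only shows the shape of the trajectory, not choices or RTs.
    All parameter values are illustrative assumptions.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    t = np.arange(0.0, duration, dt)
    # Drift starts at the value bias and grows as sensory evidence builds.
    drift = value_bias + evidence_slope * t
    # Gaussian accumulation noise (zero when noise_sd is 0).
    noise = noise_sd * np.sqrt(dt) * rng.standard_normal(t.size)
    dv = np.cumsum(drift * dt + noise)
    return t, dv

# Low-value cue: the bias opposes the cued (correct) action.
t, dv_low = simulate_dv(value_bias=-1.0, evidence_slope=20.0)
# High-value cue: the bias favors the cued action.
_, dv_high = simulate_dv(value_bias=+1.0, evidence_slope=20.0)

# The noise-free low-value trajectory bottoms out where the drift crosses zero.
t_turn = t[np.argmin(dv_low)]
```

With these assumed values the noise-free low-value trajectory first moves toward the incorrect action and reverses near `|value_bias| / evidence_slope` seconds, while the high-value trajectory rises monotonically.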

Conflict of interest statement

Declaration of Interests:

The authors declare no competing interests.

Figures

Figure 1. Behavioral task and data
(A) Upon achieving fixation, subjects first viewed two peripheral, equiluminant green and cyan discs (“targets”) indicating which color alternative maps to which response hand on the current trial. After a 777–824 ms delay, the fixation point abruptly changed its color with equal probability to one of the target colors, demanding immediate execution of the corresponding response within a strict deadline of 325 ms. Depending on the cued color, a correct response earned the subject five or forty points. Here, a low-value correct trial is illustrated. Within each 120-trial block, the color-to-value association remained fixed but the mapping of color to response changed pseudorandomly on a trial-by-trial basis. (B) Upper panel: Reaction time (RT) distributions for both correct (thick lines) and error responses (thin lines) on trials with high-value cues (red) and low-value cues (green). The 325-ms deadline is indicated by the vertical gray line. Mean RTs are indicated by symbols placed over the respective RT distributions. Error bars indicate S.E.M., which extends little beyond the symbol sizes. Lower panel: Conditional accuracy functions quantifying accuracy as a function of RT. Action choices were strongly value-biased for fast responses but became increasingly sensory-based with increasing RT, eventually converging on almost perfect accuracy.
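A conditional accuracy function of the kind described in (B) can be sketched as follows. This is one common way to compute it, assuming equal-count RT bins; the binning actually used in the study is not stated here, and `conditional_accuracy` is a hypothetical helper name.

```python
import numpy as np

def conditional_accuracy(rts, correct, n_bins=4):
    """Return mean RT and proportion correct within RT-sorted bins.

    Assumes equal-count bins (an illustrative choice, not necessarily
    the binning used in the study).
    """
    rts = np.asarray(rts, dtype=float)
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(rts)                  # trial indices, fastest to slowest
    bins = np.array_split(order, n_bins)     # near-equal-count bins
    mean_rt = np.array([rts[b].mean() for b in bins])
    accuracy = np.array([correct[b].mean() for b in bins])
    return mean_rt, accuracy

# Toy data: four trials (RT in seconds, 1 = correct), split into two bins.
mean_rt, accuracy = conditional_accuracy(
    rts=[0.30, 0.20, 0.25, 0.35], correct=[1, 0, 1, 1], n_bins=2
)
```

In the toy data the faster bin is less accurate than the slower bin, which is the qualitative pattern the caption describes for low-value cues.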
Figure 2. Comparison of alternative accumulation-to-bound models
(A–D) Schematics illustrating the four alternative, simplified models for capturing the value biases and relatively fast error RTs observed in our behavioral data. In the standard models typically examined (A and B), sensory evidence (SE) is stationary (additive noise not shown) and thus the decision variable (DV) on average increases linearly with time. In the models with increasing evidence (C and D), drift rate (represented directly as mean sensory evidence here for clarity of presentation) increases linearly over time, so that the DV follows a curved (quadratic) path; these models have no starting point variability. In the starting point bias models, the initial DV value is shifted towards the high-value bound with no change to the evidence (A and C). In the drift rate bias models, the sensory evidence driving the accumulation process is offset in the direction of the higher value (B and D). (E) Mean Bayesian Information Criterion (BIC) values quantifying goodness of fit for the four alternative models, arranged in the same order as panels A–D. The smallest BIC value, obtained for the drift rate bias model with increasing evidence, signifies that it provides the best fit to behavior. Average parameter values are listed in Table S1 and simulated RT distributions and conditional accuracy functions in Figure S2. Adding either a difference in non-decision time between high- and low-value responses or drift rate variability to the standard stationary-evidence models did not change these results (see Figure S1). (F–I) Simulated decision variable dynamics over time starting from the onset of accumulation for correct high-value-cued trials, correct low-value-cued trials with relatively fast and relatively slow RT, and incorrect low-value-cued trials, using parameters estimated from the behavioral fit of each competing model (E). Starting point biases were reflected in positive and negative shifts in DV starting level for high- and low-value cues, respectively (‘1’, F,H). Starting point variability was reflected in error trials being associated with a starting level closer to the error bound, and higher starting levels for faster correct trials (‘2’, F,G). A biased and increasing drift rate was uniquely associated with a “turn-around” effect on slower, correct responses to low-value cues, because the drift rate shifts from an initially negative to a positive value as the sensory influence grows. With stochastic variation, this initial downward trajectory often goes far enough to cross the lower bound, resulting in an error. SPB-VS: starting point bias model with variable starting point; DRB-VS: drift rate bias model with variable starting point; SPB-IE: starting point bias model with increasing evidence; DRB-IE: drift rate bias model with increasing evidence. See also Figure S1, Figure S2 and Table S1.
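For readers unfamiliar with the model comparison in (E), the standard likelihood-based definition of BIC is sketched below. Whether the study computed BIC from a log-likelihood or from another fit statistic is not stated here, so this form is an assumption, and the numbers are made up purely to show the arithmetic.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower values indicate a better fit
    after penalizing the number of free parameters."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical values for two models fitted to the same 1000 observations.
bic_a = bic(log_likelihood=-100.0, n_params=5, n_obs=1000)
bic_b = bic(log_likelihood=-100.0, n_params=6, n_obs=1000)
```

With equal log-likelihoods, the model with fewer free parameters obtains the smaller BIC and would be preferred.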
Figure 3. Electrophysiological signals reflecting relative motor preparation and evidence accumulation
(A) Empirically measured differential motor preparation reflected in the lateralized readiness potential (LRP), for the same 4 trial conditions as simulated in Figure 2F–I. Upward deflections reflect preparation towards the correct response. Key signatures of all four simplified models (Figure 2) are exhibited. Right: scalp potential distribution of the difference between left-response and right-response trials, illustrating the LRP topography. (B) The centro-parietal positivity (CPP), which reflects a motor-independent representation of cumulative evidence at a more abstract level, plotted for the same conditions. Consistent with initial accumulation of ‘wrong’ evidence followed by a gradual take-over of correct evidence (reflected in the turn-around in LRP), the CPP exhibited an initial buildup, then a momentary lull (roughly coincident with LRP crossing back over its baseline level), and then a resumed buildup, particularly for the conditions of incorrect and slow correct low-value cues. In the case of errors, the initial buildup of “wrong” evidence was enough to cross the error bound. Note that the dip is most strongly exhibited for slower correct and fast incorrect low-value cued trials (the two conditions with longest and shortest RT, respectively), which rules out the possibility that these patterns arise from differences in the temporal overlap of non-decision related stimulus- and response-locked processes due to RT differences. Figure S3 repeats this analysis for more RT bins to highlight the graded nature of the effects.
Figure 4. Value-Modulated Sensory Response (VMSR) model
(A) Idealized sensory responses of two neural populations with a tuning preference for each color alternative, forming the basis of the VMSR model. Response profiles trace the expectation (i.e., trial-average) of activity over time, and additive noise is applied on each single trial. Taking the example of a presented cyan cue, the “preferred” (cyan neurons) and “unpreferred” (green neurons) sensory populations are initially excited equally strongly under value-neutral conditions (left), but selectivity gradually develops as the “unpreferred” population activity drops away. The differential evidence (gray trace) thus increases from zero to a stable positive level. When cyan is the higher-value color (middle), the cyan neurons’ response is enhanced and the green neurons’ response attenuated, which results in a positive offset in the differential evidence (red trace). Meanwhile when cyan is the lower-value color (right), the modulations are reversed so that the differential evidence is offset negatively (green trace) as in the abstract version of the model (Figure 2D). (B) Simulation of average Decision Variable (DV) trajectories for the VMSR model using parameters from fits to the behavioral data (see Methods). Both cue-locked and response-locked DV waveforms, simulated for each individual and then averaged, match the empirically observed dynamics of the LRP (Figure 3A) including the distinctive “turn-around” effect. (C) The motor-independent accumulator signal (CPP) was simulated from the same behavioral fit of the VMSR model by taking the absolute value of the cumulative differential sensory evidence (|ΣSE|) on each single trial. Like the empirical CPP (Figure 3B), the simulated CPP trace for incorrect and slow correct low-value cues undergoes an initial buildup followed by a lull and then a second phase of buildup. Note that although the simulated CPP traces on single trials with turn-around effects dip down to zero at the time point where differential cumulative evidence passes from negative to positive, averaging across trials and subjects significantly blunts this dip in the average simulated traces (see also Figures S2, S3 and S4).
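The |ΣSE| computation in (C), and the way averaging blunts the single-trial dip, can be sketched with toy numbers. The function name `simulated_cpp` and the evidence samples below are hypothetical; they are not outputs of the fitted VMSR model.

```python
import numpy as np

def simulated_cpp(se_trials, dt=1.0):
    """Absolute value of cumulative differential sensory evidence, per trial.

    se_trials has shape (n_trials, n_samples); negative samples favor the
    incorrect action. Values used below are toy numbers for illustration.
    """
    se_trials = np.asarray(se_trials, dtype=float)
    return np.abs(np.cumsum(se_trials * dt, axis=1))

# Two toy "turn-around" trials whose cumulative evidence crosses zero
# at different time points.
se = [[-1, -1, 1, 1, 1],
      [-1,  1, 1, 1, 1]]
cpp_single = simulated_cpp(se)          # each row touches zero once
cpp_average = cpp_single.mean(axis=0)   # the zero is averaged away
```

Each single-trial trace dips to zero at its own crossing time, but because the crossing times differ, the trial-averaged trace shows only a lull rather than a return to zero.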

References

    1. Feng S, Holmes P, Rorie A, Newsome WT. Can Monkeys Choose Optimally When Faced with Noisy Stimuli and Unequal Rewards? PLoS Comput Biol. 2009;5
    2. Simen P, Contreras D, Buck C, Hu P, Holmes P, Cohen JD. Reward rate optimization in two-alternative decision making: empirical tests of theoretical predictions. J Exp Psychol Hum Percept Perform. 2009;35:1865–1897.
    3. Rorie AE, Gao J, McClelland JL, Newsome WT. Integration of Sensory and Reward Information during Perceptual Decision-Making in Lateral Intraparietal Cortex (LIP) of the Macaque Monkey. PLOS ONE. 2010;5:e9308.
    4. Summerfield C, Koechlin E. Economic Value Biases Uncertain Perceptual Choices in the Parietal and Prefrontal Cortices. Front Hum Neurosci. 2010;4
    5. Mulder MJ, Wagenmakers EJ, Ratcliff R, Boekel W, Forstmann BU. Bias in the Brain: A Diffusion Model Analysis of Prior Probability and Potential Payoff. J Neurosci. 2012;32:2335–2343.
