PLoS One. 2010 Feb 19;5(2):e9308.
doi: 10.1371/journal.pone.0009308.

Integration of sensory and reward information during perceptual decision-making in lateral intraparietal cortex (LIP) of the macaque monkey

Alan E Rorie et al. PLoS One.

Abstract

Single neurons in cortical area LIP are known to carry information relevant to both sensory and value-based decisions that are reported by eye movements. It is not known, however, how sensory and value information are combined in LIP when individual decisions must be based on a combination of these variables. To investigate this issue, we conducted behavioral and electrophysiological experiments in rhesus monkeys during performance of a two-alternative, forced-choice discrimination of motion direction (sensory component). Monkeys reported each decision by making an eye movement to one of two visual targets associated with the two possible directions of motion. We introduced choice biases to the monkeys' decision process (value component) by randomly interleaving balanced reward conditions (equal reward value for the two choices) with unbalanced conditions (one alternative worth twice as much as the other). The monkeys' behavior, as well as that of most LIP neurons, reflected the influence of all relevant variables: the strength of the sensory information, the value of the target in the neuron's response field, and the value of the target outside the response field. Overall, detailed analysis and computer simulation reveal that our data are consistent with a two-stage drift diffusion model proposed by Diederich and Busemeyer for the effect of payoffs in the context of sensory discrimination tasks. Initial processing of payoff information strongly influences the starting point for the accumulation of sensory evidence, while exerting little if any effect on the rate of accumulation of sensory evidence.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. A two-alternative, forced-choice, motion discrimination task with multiple reward contingencies; sequence of events comprising a typical trial.
From left to right, trials begin with the onset of a fixation point. Two saccade targets appear and then change color indicating the magnitude of the reward available for correctly choosing that target. A blue target indicates a low magnitude (L) reward, while a red target indicates a high magnitude (H) reward. The four reward conditions are depicted vertically: LL, HH, LH and HL, from top to bottom. The visual motion stimulus is centered on the fixation point. Following offset of the motion stimulus, the monkey maintains fixation during a variable delay period after which the fixation point disappears, cueing the monkey to report his decision with a saccade to the target corresponding to the perceived direction of motion. If the monkey chooses the correct direction of motion, he receives the reward indicated by the color of the chosen target.
Figure 2. Relative reward biases choice.
A–D. Psychometric functions (PMFs) describing each monkey's probability of choosing T1 as a function of motion coherence. Motion coherence is denoted with a magnitude indicating the strength of the motion and a sign indicating its direction. Positive coherence denotes motion towards T1 while negative coherence denotes motion towards T2. Separate PMFs are plotted for each reward condition (HH, red; LL, blue; HL, black; LH, green). Circles depict the observed proportion of T1 choices, and sigmoidal curves are fit quantitatively with logistic regression. A–B. Results from one representative experiment for monkey A and monkey T, respectively. C–D. Average PMFs across all behavioral sessions for monkeys A (n = 33) and T (n = 24), respectively.
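
The logistic psychometric functions described in this caption can be illustrated with a minimal sketch. It assumes a two-parameter logistic form in which each reward condition contributes its own bias term and all conditions share one sensitivity to signed coherence; the exact regression model used in the paper is not given here. The function names and every parameter value below are hypothetical, and the sketch assumes that the first letter of a condition label refers to T1.

```python
import math

def p_choose_t1(coherence, bias, sensitivity):
    """Logistic psychometric function: P(T1 choice) for a signed coherence (%)."""
    return 1.0 / (1.0 + math.exp(-(bias + sensitivity * coherence)))

def point_of_subjective_equality(bias, sensitivity):
    """Signed coherence at which T1 and T2 are chosen equally often."""
    return -bias / sensitivity

# Hypothetical parameters for illustration only (not fitted values from the paper).
SENSITIVITY = 0.15  # change in log-odds per percent coherence
BIAS = {"HH": 0.0, "LL": 0.0, "HL": 1.0, "LH": -1.0}  # assumed: first letter = T1

for condition, bias in BIAS.items():
    p_zero = p_choose_t1(0.0, bias, SENSITIVITY)
    pse = point_of_subjective_equality(bias, SENSITIVITY)
    print(f"{condition}: P(T1 | 0% coherence) = {p_zero:.3f}, PSE = {pse:.2f}%")
```

In this parameterization a reward-induced bias shifts the curve horizontally without changing its slope, which is one simple way to read the separation between the HL and LH curves.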
Figure 3. LIP represents the absolute value of the option in the RF.
A. Average data from monkey A (n = 51 cells). B. Average data from monkey T (n = 31 cells). Mean LIP firing rate as a function of time, for the HH (red) and LL (blue) reward conditions. Data are plotted separately for T1 (solid) and T2 (dashed) choices. 0–250 ms is the target epoch in which the blank targets are presented; 250–500 ms is the reward epoch in which the targets change color to cue the reward condition; 500–1000 ms is the motion epoch in which the random-dot motion stimulus is presented; 1000–1250 ms is the early segment of the delay epoch; −350–0 ms (in the right panel) is the late delay epoch immediately preceding the saccade. Any difference between the red and blue curves indicates an effect of the absolute value of the option in the RF.
Figure 4. LIP represents the relative value of the option in the RF.
A. Average data from monkey A (n = 51 cells). B. Average data from monkey T (n = 31 cells). Mean LIP firing rate as a function of time, for the HH (red) and HL (black) reward conditions. HH curves are the same as in Figure 3A–B. Data are plotted separately for T1 (solid) and T2 (dashed) choices. In the left panels, responses are aligned to the target onset, while in the right panels, responses are aligned to the saccade time. Any difference between the red and black curves indicates an effect of the relative value of the option in the RF.
Figure 5. A second look at the relative value effect.
A. Average data from monkey A (n = 51 cells). B. Average data from monkey T (n = 31 cells). Mean LIP firing rate as a function of time, for the LL (blue) and LH (green) reward conditions. LL curves are the same as in Figure 3A–B. Data are plotted separately for T1 (solid) and T2 (dashed) choices. In the left panels, responses are aligned to the target onset, while in the right panels, responses are aligned to the saccade time. Any difference between the blue and green curves indicates an effect of the relative value of the option in the RF.
Figure 6. Quantifying the dynamics of absolute value, relative value, motion coherence and choice.
A. Average regression coefficients from monkey A. B. Same data for monkey T. Mean values (±sem) of βcoh (black), βt1 (red), βt2 (blue) and βchoice (green) coefficients as a function of time. These coefficients represent the average effect of motion coherence, T1 value, T2 value, and choice on firing rate. They are fit by applying Equation 3 to the average firing rate slid in 1 ms intervals across the duration of the trial. Window width = 50 ms.
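
The windowing step in this caption (a 50 ms window slid in 1 ms steps) can be sketched as follows. This is only an illustration of how a windowed firing rate might be computed from spike times on a single trial; the function name and the spike times are hypothetical, and the regression itself (Equation 3) is not reproduced because it does not appear in this excerpt.

```python
def sliding_window_rate(spike_times_ms, t_start_ms, t_end_ms, width_ms=50, step_ms=1):
    """Return (window_centers_ms, rates_hz) for a window slid across one trial.

    Each window covers the half-open interval [left, left + width_ms).
    """
    centers, rates = [], []
    left = t_start_ms
    while left + width_ms <= t_end_ms:
        right = left + width_ms
        count = sum(1 for t in spike_times_ms if left <= t < right)
        centers.append(left + width_ms / 2)
        rates.append(count * 1000.0 / width_ms)  # spikes per window -> spikes per second
        left += step_ms
    return centers, rates

# Hypothetical spike times (ms) for one trial, for illustration only.
spikes = [10, 20, 30, 100]
centers, rates = sliding_window_rate(spikes, 0, 200)
print(len(centers), centers[0], rates[0])
```

A regression coefficient time course like the one plotted would then come from fitting the model separately to the rates in each window.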
Figure 7. Reward and motion information are multiplexed at the single neuron level.
The bars depict the percentage of neurons that are modulated significantly by one, two or three model parameters: T1val, T2val or coherence. A. Data from the second half of the motion epoch. B. Data from the early delay epoch. Red bars: monkey A. Blue bars: monkey T.
Figure 8. Possible mechanisms underlying the effect of imbalanced payoffs on behavioral choice.
A. Idealized LIP activity as a function of time during the motion epoch for one spatial location. Time zero indicates the initiation of the motion stimulus. In the model, motion evidence supporting a decision accumulates until it reaches a bound indicated by the dashed lines. B. “Two-stage” mechanism. In the first stage, information about payoff size establishes the initial offset of the accumulator, which, in the imbalanced payoff conditions (HL and LH), is biased in favor of the spatial location of the high payoff target. In the second stage, motion information accumulates to a fixed bound, as in A. C. “Drift rate” mechanism. The accumulator offset is identical for all payoff conditions, but payoff information is incorporated into the drift rate of the accumulation process, again biasing the process in favor of the high payoff target. D. Payoff information affects neither the offset nor the drift rate, but rather exerts its effect through adjustment of the decision bound. HL = high-low reward condition (large payoff target in the LIP response field; small payoff target in the opposite hemifield). LH = low-high reward condition (small payoff target in the LIP response field).
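
The three candidate mechanisms in this caption can be compared with a small simulation. The sketch below is a deliberately simplified, single-accumulator diffusion process with an upper bound (T1 choice) and a lower bound (T2 choice); it is not the model fitted in the paper, and all names and parameter values are hypothetical. Each mechanism is expressed by changing exactly one quantity: the starting point (panel B), the drift rate (panel C), or the bound (panel D).

```python
import random

def simulate_choice(drift, start=0.0, upper=1.0, lower=-1.0,
                    noise_sd=1.0, dt=0.005, max_steps=2000, rng=random):
    """Accumulate noisy evidence until a bound is hit; return +1 (T1) or -1 (T2)."""
    x = start
    for _ in range(max_steps):
        x += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        if x >= upper:
            return 1
        if x <= lower:
            return -1
    return 1 if x >= 0 else -1  # no bound reached: report the sign of the evidence

def p_t1(n_trials=1000, seed=0, **kwargs):
    """Fraction of simulated trials that end in a T1 choice."""
    rng = random.Random(seed)
    wins = sum(simulate_choice(rng=rng, **kwargs) == 1 for _ in range(n_trials))
    return wins / n_trials

# Zero net motion evidence (drift = 0) isolates the payoff-induced bias.
# All values are hypothetical and chosen only to make the effect visible.
results = {
    "balanced":            p_t1(drift=0.0),
    "start offset (B)":    p_t1(drift=0.0, start=0.3),
    "drift-rate bias (C)": p_t1(drift=0.5),
    "lowered T1 bound (D)": p_t1(drift=0.0, upper=0.6),
}
for name, p in results.items():
    print(f"{name}: P(T1) = {p:.3f}")
```

All three manipulations raise the proportion of T1 choices, so choice data alone cannot easily separate them; the mechanisms differ in what they predict for the time course of the accumulating signal, which is why the neural recordings matter.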
Figure 9. Unbalanced rewards result in an offset to the starting point of the accumulation process.
A. Average LIP firing rate (±sem) as a function of time for the HL (black) and LH (green) reward conditions (monkey A, T1 choices only). Activity is averaged across all coherences. The black and green curves are replotted from Figs. 4A and 5A (respectively), expanding the horizontal scale to emphasize the interval at and following the onset of stimulus motion (time 500). B. Equivalent data for monkey T. Traces are replotted from Figs. 4B and 5B.
Figure 10. Rate of accumulation for the unbalanced reward conditions.
A–B. Normalized firing rates (±sem) for a single motion condition (+48% coherence), averaged across the population of neurons from monkey A and monkey T, respectively. All data are from trials ending in a T1 choice. C–D. As in A–B, but averaged across all positive coherences. The HL condition is depicted in black, the LH condition in green. Time zero is the time of the initial “dip” in firing rate following onset of the motion signal, identified separately for each neuron (see Methods).
Figure 11. A model account of the neural and behavioral observations in Monkey A.
A. Empirically observed behavioral data from monkey A (left) and simulated behavioral results (right) of the competing accumulator model described in the text. Four colors indicate the four reward conditions: HH (red), LL (blue), HL (black), LH (green). B. Empirically observed physiological data from monkey A (left) and simulated physiological results (right) for the imbalanced reward conditions. Solid lines indicate trials ending in T1 choices; dashed lines indicate trials ending in T2 choices. Color code is the same as in the top panels. C. Empirically observed physiological data from monkey A (left) and simulated physiological results (right) for the balanced reward conditions. Solid and dashed lines, and the color code, are the same as in the preceding panels. Parameter values used in the reported fits are as follows: b = 10; a = 0.5; σb = 14; σw = 1; θ = 22. Values are in units of seconds and Hertz.

References

    1. Diederich A, Busemeyer JR. Modeling the effects of payoff on response bias in a perceptual discrimination task: bound-change, drift-rate-change, or two-stage-processing hypothesis. Percept Psychophys. 2006;68:194–207.
    2. Green DM, Swets JA. Signal detection and psychophysics. New York: Wiley; 1966.
    3. Laming D. Information theory of choice reaction times. New York: Academic Press; 1968.
    4. Link S, Heath R. A sequential theory of psychological discrimination. Psychometrika. 1975;40:77–105.
    5. Ratcliff R. A theory of memory retrieval. Psychol Rev. 1978;85:59–108.
