Q J Exp Psychol (Hove). 2010 May;63(5):863-91.
doi: 10.1080/17470210903091643. Epub 2009 Sep 10.

Do humans produce the speed-accuracy trade-off that maximizes reward rate?

Rafal Bogacz et al. Q J Exp Psychol (Hove). 2010 May.

Abstract

In this paper we investigate trade-offs between speed and accuracy that are produced by humans when confronted with a sequence of choices between two alternatives. We assume that the choice process is described by the drift diffusion model, in which the speed-accuracy trade-off is primarily controlled by the value of the decision threshold. We test the hypothesis that participants choose the decision threshold that maximizes reward rate, defined as an average number of rewards per unit of time. In particular, we test four predictions derived on the basis of this hypothesis in two behavioural experiments. The data from all participants of our experiments provide support only for some of the predictions, and on average the participants are slower and more accurate than predicted by reward rate maximization. However, when we limit our analysis to subgroups of 30-50% of participants who earned the highest overall rewards, all the predictions are satisfied by the data. This suggests that a substantial subset of participants do select decision thresholds that maximize reward rate. We also discuss possible reasons why the remaining participants select thresholds higher than optimal, including the possibility that participants optimize a combination of reward rate and accuracy or that they compensate for the influence of timing uncertainty, or both.
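The hypothesis in the abstract can be illustrated with a minimal sketch. It assumes the standard pure drift diffusion model expressions for error rate and decision time given in Bogacz et al. (2006); those expressions are not stated in this abstract. The function names, the grid search, and the grid bounds are hypothetical choices made for illustration. Drift, noise, and non-decision time are taken from the pure DDM fit reported in the Figure 3 caption (A=0.219, c=0.1, T0=0.346), and the delay conditions from the Figure 2 caption.

```python
import math

def ddm_error_rate(a, z, c):
    """Error rate of the pure DDM (drift a, threshold z, noise c); assumed form."""
    return 1.0 / (1.0 + math.exp(2.0 * a * z / c**2))

def ddm_decision_time(a, z, c):
    """Mean decision time of the pure DDM; assumed form."""
    return (z / a) * math.tanh(a * z / c**2)

def reward_rate(z, a, c, t0, d, d_p=0.0):
    """Rewards per unit time: fraction correct divided by mean trial duration.

    t0 is non-decision time, d the delay after every response,
    d_p the additional penalty delay after an error.
    """
    er = ddm_error_rate(a, z, c)
    dt = ddm_decision_time(a, z, c)
    return (1.0 - er) / (dt + t0 + d + er * d_p)

def best_threshold(a, c, t0, d, d_p=0.0, z_max=0.3, steps=3000):
    """Grid search for the threshold that maximizes reward rate (illustrative only)."""
    grid = [z_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda z: reward_rate(z, a, c, t0, d, d_p))

# Delay conditions from the Figure 2 caption: (D, Dp) in seconds.
conditions = {"D=0.5": (0.5, 0.0), "D=1": (1.0, 0.0),
              "D=2": (2.0, 0.0), "D=0.5,Dp=1.5": (0.5, 1.5)}
optimal = {name: best_threshold(0.219, 0.1, 0.346, d, d_p)
           for name, (d, d_p) in conditions.items()}
```

Under these assumptions the reward-rate-maximizing threshold rises with the delay, and the condition with D=0.5, Dp=1.5 yields the same optimum as D=2, because the penalty delay enters the inverse reward rate only through the total delay plus a constant.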

Figures

Figure 1
Optimal performance curves. Horizontal axes show the error rate and vertical axes show the decision time (DT) normalized by total delay Dtotal. The thick line (identical in both panels) is the optimal performance curve for the reward rate (RR). The thin lines show the generalized optimal performance curves for reward accuracy (panel a) and modified RR (panel b). Each curve corresponds to a different value of q ranging from 0 (when it reduces to the optimal performance curve for RR, shown in the thick lines) to 0.5 (top curves) in steps of 0.1.
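The thick curve described in this caption can be sketched as follows. The closed form used here is the optimal performance curve for reward rate from Bogacz et al. (2006) for the pure drift diffusion model; it is not given in this caption, so treat it as an assumption. The function name is hypothetical, and the generalized curves that depend on q are not reproduced.

```python
import math

def optimal_normalized_dt(er):
    """Normalized decision time DT / Dtotal on the reward-rate optimal
    performance curve, as a function of error rate er (0 < er < 0.5).

    Assumed form: 1 / ( 1 / (er * ln((1 - er) / er)) + 1 / (1 - 2 * er) ).
    """
    if not 0.0 < er < 0.5:
        raise ValueError("error rate must lie strictly between 0 and 0.5")
    return 1.0 / (1.0 / (er * math.log((1.0 - er) / er))
                  + 1.0 / (1.0 - 2.0 * er))

# Sample the curve on a coarse grid of error rates.
curve = [(er / 100.0, optimal_normalized_dt(er / 100.0)) for er in range(1, 50)]
```

The curve is an inverted U: it approaches zero as the error rate approaches 0 or 0.5 and peaks at an intermediate error rate.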
Figure 2
Average error rate (ER) and reaction time (RT) in units of seconds for all twenty participants of Experiment 1 and all delay conditions. Delay conditions are indicated on the x-axes; labels correspond to conditions (1) D=0.5s; (2) D=1s; (3) D=2s; and (4) D=0.5s, Dp=1.5s. Error bars reflect comparisons between two adjacent conditions (inspired by Masson & Loftus, 2003). For example, in panel a, the error bar for condition D=0.5 and the left error bar for condition D=1 reflect the comparison between these two conditions. The height of the error bars corresponds to the standard error of the differences between ERs (or RTs in panel b) in the two conditions. Specifically, the difference is first calculated for each participant, and then the standard error of the differences is calculated across participants; both error bars are equal to this standard error, so they have equal heights. These error bars have a standard interpretation: if two adjacent bars differ from each other by more than approximately two heights of the corresponding error bars, then the ERs (or RTs) are significantly different according to the paired t-test.
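The error-bar construction described in this caption can be sketched as below. The data values are hypothetical and are not taken from the experiment; the function names are also illustrative. The sketch shows why "about two error-bar heights" corresponds to significance: the paired t statistic is the mean difference divided by the standard error of the differences, and with 20 participants (19 degrees of freedom) the two-tailed critical value at p = 0.05 is close to 2.09.

```python
import math
import statistics

def paired_difference_se(cond_a, cond_b):
    """Standard error of the per-participant differences between two conditions."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))

def paired_t(cond_a, cond_b):
    """Paired t statistic: mean difference in units of the error-bar height."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    return statistics.mean(diffs) / paired_difference_se(cond_a, cond_b)

# Hypothetical mean RTs (seconds) for four participants in two adjacent conditions.
rt_long_delay = [0.50, 0.55, 0.60, 0.52]
rt_short_delay = [0.45, 0.52, 0.50, 0.50]

bar_height = paired_difference_se(rt_long_delay, rt_short_delay)
t_stat = paired_t(rt_long_delay, rt_short_delay)
```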
Figure 3
Quantile probability plots showing the fits of the pure (a) and the extended (b) drift diffusion model (DDM) to behavioural data from a sample participant. Open circles correspond to the behavioural data. The horizontal axes indicate the probability of error in the left parts of the panels, and probability of correct choice in the right parts (see labels on x-axes). In each part, the four columns of circles correspond to the four delay conditions in the experiment. In each column of circles, the vertical positions of the five circles indicate the quantiles of reaction times. Small filled circles visualise the corresponding predictions for the DDM. These are connected by lines to make the patterns they create more visible. For clarity, error bars with confidence intervals for quantiles of reaction times are not shown here. They are plotted for the same participant in Figure 5 of Bogacz et al. (2006), which shows that the confidence intervals are very large for the error trials (up to 1.48s for the 0.9 quantile in D=0.5, Dp=1.5 condition) because of the small number of such trials. The estimated parameters of the models (for noise parameter fixed at c=0.1): a) pure DDM: T0=0.346, A=0.219, z1=0.0398, z2=0.0535, z3=0.0610, z4=0.0682; b) extended DDM: T0= 0.372, st=0.084, mA=0.344, sA=0.152, sx=0.044, z1=0.0503, z2=0.0592, z3=0.0687, z4=0.0816.
Figure 4
Comparison of parameters T0, ã and the normalized threshold z̃ estimated by fitting the pure (vertical axes) and extended (horizontal axes) drift diffusion models. In the left and the central panels dots correspond to individual participants; in the right panel different experimental delay conditions are indicated by different symbols (see legend).
Figure 5
Comparison of participants' and optimal thresholds in Experiment 1. The left column of panels (a, c, e) compares estimates and predictions of the pure drift diffusion model (DDM), while the right column of panels (b, d, f) – of the extended DDM. In panels a-d, horizontal axes correspond to participants' normalized thresholds, the optimal normalized thresholds are along the vertical axes, and the dotted line is the identity line. Different experimental delay conditions are indicated by different symbols (see legend). Note different scales between rows of panels. Panels a and b show a sample participant with the best fit; panels c and d show data of all 20 participants. Panels e and f show the mean participants' (black bars) and optimal (white bars) thresholds averaged across participants for different delay conditions (indicated on the horizontal axis). Error bars show standard error (there are error bars on bars corresponding to optimal thresholds, because different participants have different optimal thresholds). Stars indicate the level of significance of the difference between participants' and optimal thresholds (paired t-test): one star denotes p < 0.05, two stars denote p < 0.01, three stars denote p < 0.001.
Figure 6
Mean error rate and mean reaction time (in seconds) for all 60 participants in Experiment 2 for each delay condition averaged across difficulty conditions. Error bars as in Figure 2.
Figure 7
Comparison of participants' and optimal thresholds in Experiment 2. Left panels (a and c) show findings for the easy condition, right panels (b and d) for the difficult condition. In panels a and b, participants' normalized thresholds are shown on horizontal axes, optimal normalized thresholds on the vertical axes, and the dotted line is the identity line. Different experimental delay conditions are indicated by different shapes (see legend). Note different scales. Panels c and d show the mean participants' (black bars) and optimal (white bars) thresholds averaged across participants for different delay conditions (indicated on the horizontal axis). Error bars show standard error, and stars indicate the level of significance of the difference between participants' and optimal thresholds (paired t-test): one star denotes p < 0.05, two stars denote p < 0.01, three stars denote p < 0.001.
Figure 8
Performance of participants with reward scores higher than the median reward score for Experiment 2. a) Relationship between theory match error (x-axis) and achieved reward score (y-axis). Each symbol corresponds to one participant; genders as indicated in key. b, c) Mean error rate and reaction time (in units of seconds) averaged across difficulty conditions. Error bars as in Figure 2. d, e) mean participants' (black bars) and optimal (white bars) normalized thresholds averaged across participants for different delay conditions (indicated on the horizontal axis). Panel d shows data for easy conditions; panel e – for difficult conditions. Error bars show standard error, and stars indicate the level of significance of the difference between participants' and optimal thresholds (paired t-test): one star denotes p < 0.05, two stars denote p < 0.01, three stars denote p < 0.001.
Figure 9
Optimal performance curves. In all panels except b, horizontal axes show the error rate (ER) and vertical axes show the normalized decision time (DT), i.e. DT divided by total delay Dtotal. Note the differences in scales between panels. Solid black curves show the theoretical prediction based on the assumption that participants maximize the reward rate. a) Normalized DTs as a function of ER plotted for one participant of Experiment 2. Each circle corresponds to one experimental condition. For two conditions the normalized DTs and ERs were very close, so two circles overlap (in the bottom right corner of the panel). b) Histogram of the correlations between normalized DT and the normalized DT predicted by the optimal performance curve. Each correlation is calculated for one participant of Experiment 2. Shaded bars indicate the correlations that were statistically significant (p<0.05). In panels c-e, the bars show the mean normalized DT averaged across participants' conditions with ERs falling into the given interval (from Experiments 1 and 2). The error bars indicate standard error. There are no error bars for some bins, as there was just one condition falling into this bin, or DTs for participants' conditions falling into this bin were all very close to 0. c) The bars are based on all participants. d) The participants are divided into three groups on the basis of their reward scores and bars in each of the three panels are based on the data from the corresponding group of participants (see titles of the panels). e) Data from conditions with Dp=0 based on subgroups of participants as in panel d. Best fitting optimal performance curves for RA and RRm are also shown (see legend).
Figure 10
Changes in participants' performance within single blocks of trials. In all panels horizontal axes show the trial number within a block; each point corresponds to a bin of 5 trials. Vertical axes show the difference in mean reaction time (RT) between delay conditions D = 2 and D = 0.5, averaged over the 5 trials in a given bin and both difficulty conditions. Error bars indicate standard error. Panel a shows data averaged across participants of Experiments 1 and 2 with reward scores higher than the median for each experiment, and panel b shows data averaged across participants with reward scores lower than the median.

References

    1. Akaike H. Likelihood of a model and information criteria. Journal of Econometrics. 1981;16:3–14.
    2. Bogacz R, Brown E, Moehlis J, Holmes P, Cohen JD. The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced choice tasks. Psychological Review. 2006;113:700–765.
    3. Bogacz R, Cohen JD. Parameterization of connectionist models. Behavioral Research Methods, Instruments, and Computers. 2004;36:732–741.
    4. Brainard DH. The Psychophysics Toolbox. Spatial Vision. 1997;10:433–436.
    5. Brown E, Holmes P. Modeling a simple choice task: stochastic dynamics of mutually inhibitory neural groups. Stochastics and Dynamics. 2001;1:159–191.
