J Neurophysiol. 2015 Jul;114(1):99-113.
doi: 10.1152/jn.00793.2014. Epub 2015 May 6.

Confidence estimation as a stochastic process in a neurodynamical system of decision making

Ziqiang Wei et al. J Neurophysiol. 2015 Jul.

Abstract

Evaluation of confidence about one's knowledge is key to the brain's ability to monitor cognition. To investigate the neural mechanism of confidence assessment, we examined a biologically realistic spiking network model and found that it reproduced salient behavioral observations and single-neuron activity data from a monkey experiment designed to study confidence about a decision under uncertainty. Interestingly, the model predicts that changes of mind can occur in a mnemonic delay when confidence is low; the probability of changes of mind increases (decreases) with task difficulty in correct (error) trials. Furthermore, a so-called "hard-easy effect" observed in humans naturally emerges, i.e., behavior shows underconfidence (underestimation of correct rate) for easy or moderately difficult tasks and overconfidence (overestimation of correct rate) for very difficult tasks. Importantly, in the model, confidence is computed using a simple neural signal in individual trials, without explicit representation of probability functions. Therefore, even a concept of metacognition can be explained by sampling a stochastic neural activity pattern.
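The hard-easy effect described above compares a confidence estimate with the actual correct rate. A minimal sketch of such a comparison follows, assuming the score is simply mean confidence minus correct rate; the function names and the numbers are hypothetical illustrations, not values from the study.

```python
def confidence_bias(mean_confidence, correct_rate):
    """Over-/underconfidence score: positive means confidence exceeds accuracy.

    Assumption: the score is a plain difference of two proportions in [0, 1].
    """
    return mean_confidence - correct_rate


def label_bias(score, tolerance=1e-9):
    """Translate the sign of the score into a verbal label."""
    if score > tolerance:
        return "overconfident"
    if score < -tolerance:
        return "underconfident"
    return "calibrated"


# Hypothetical (mean confidence, correct rate) pairs, for illustration only.
conditions = {
    "very difficult": (0.62, 0.55),
    "easy": (0.93, 0.99),
}

labels = {
    name: label_bias(confidence_bias(conf, acc))
    for name, (conf, acc) in conditions.items()
}

for name, label in labels.items():
    print(f"{name}: {label}")
```

With these illustrative inputs, the very difficult condition comes out overconfident and the easy condition underconfident, which is the pattern the abstract calls the hard-easy effect.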

Keywords: decision confidence; lateral intraparietal cortex; line-attractor neural model.

Figures

Fig. 1.
Schematic description of the decision task and model architecture. A: procedure of a simulated fixed-duration discrimination task. Following a fixation period, two targets (large red circles) appear, indicating the alternative choices. A random-dots motion stimulus is presented, followed by a delay period. A saccade to one of the alternatives indicates the decision at the end of the delay. In some trials, a sure target (blue circle) is shown after the motion offset, and choosing it leads to a certain but small amount of reward. Bottom: detailed task and input schemes. B: neural network structure. The network consists of excitatory pyramidal cells (Exc) and inhibitory interneurons (Inh). The pyramidal cells are uniformly placed on a continuous ring, and each neuron is labeled by its preferred motion direction (shown as the arrow in pyramidal cells). The excitatory-to-excitatory connections between pyramidal cells are structured as a Gaussian function of the difference in their preferred directions (upper black curve), and the connections from and onto interneurons are broad. C: motion input (centered at 90°) with different motion strengths, the integral of which is identical for all motion strengths. D: normalized input of 2 directional targets (namely TA at 90° and TB at 270°) and a sure target (namely Ts at 180°).
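The ring connectivity in panel B can be sketched as a weight that depends only on the circular difference between two cells' preferred directions. This is a minimal illustration, not the paper's implementation: the functional form (a Gaussian bump on a constant baseline) and the parameter names and values `j_minus`, `j_plus`, and `sigma_deg` are assumptions chosen for the example.

```python
import math


def circular_diff_deg(a, b):
    """Smallest absolute angular difference between two directions, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def ee_weight(theta_i, theta_j, j_minus=0.8, j_plus=1.6, sigma_deg=15.0):
    """Excitatory-to-excitatory weight as a Gaussian of the direction difference.

    j_minus, j_plus and sigma_deg are hypothetical values, not taken from the paper.
    """
    d = circular_diff_deg(theta_i, theta_j)
    return j_minus + (j_plus - j_minus) * math.exp(-d * d / (2.0 * sigma_deg ** 2))


# Place a handful of cells uniformly on the ring and build the weight matrix.
n_cells = 8
preferred = [i * 360.0 / n_cells for i in range(n_cells)]
weights = [[ee_weight(a, b) for b in preferred] for a in preferred]

for row in weights:
    print(" ".join(f"{w:.3f}" for w in row))
```

Because the weight depends only on the circular difference, the matrix is symmetric and each row is a shifted copy of the first, so every cell sees the same connectivity profile centered on its own preferred direction.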
Fig. 2.
Neuronal activity of sample trials at 0 motion strength. Spatiotemporal activity pattern of pyramidal cells in trials when the motion strength is 0. A: sample trial where Ts was not presented (stimulus duration is 627 ms). Neural pools centered around the 2 directional targets eventually diverge from each other; that near TA wins the competition, and its activity persists during the delay in the form of a bell-shaped “bump attractor.” B: average firing rate of the neural pools at TA (black line) and TB (red line) of the trial in A. C: 2 sample trials where Ts was presented (stimulus duration is 627 ms). Top: sure target induced a transient response that was suppressed because of feedback inhibition within the circuit, and the neural pool at TA preserves similar activity to that in A; therefore Ts was waived. Bottom: neural pool around Ts fires at a sufficiently high rate that it overcomes the competition with the other neural pools, which in turn is suppressed by feedback inhibition; therefore Ts was selected. Note that in this trial the neural activities of 2 competing bumps are indistinguishable before Ts onset and gradually decay to a low level after Ts onset. D: neural activities at TA (black lines), TB (red lines), and Ts (blue lines) of the trials in C. Dashed curves: the trial where Ts was selected; solid curves: the trial where Ts was waived. Note that the stimulus condition was identical for the 2 sample trials; whether the sure target was chosen or waived was completely determined by network dynamics that fluctuated from trial to trial. E and F: average activities of RA, RB, and Rs across different motion strengths (100 trials for each motion strength), which follow the same conventions as those in B and D. G and H: network dynamics underlying a trial-by-trial variation of choice in a 3D (RA, RB, Rs) decision state space. 
G: neural activity trajectories of the 2 sample trials in C from 150 ms after motion onset; starting points are marked by circles and end points by triangles. In the trial selecting Ts (black line, steps 1–2), the network wanders around RA = RB before Ts onset (Step 1 in D; black circles) and then converges to Rs after Ts onset (Step 2 in D; black circles). In the trial waiving Ts (red line, steps 1–4), the network moves toward RA before Ts onset (Step 1 in D; red circles), then moves along the Rs direction after Ts onset (Step 2 in D; red circles), and finally converges back to RA (Steps 3–4 in D; red circles). H: neural activity trajectories of the other 18 sample trials at the same stimulus condition. For the trials waiving Ts, the network first converges to a choice attractor TA (red lines) or TB (green lines) preceding Ts onset; it then moves along the direction parallel to the Rs axis because of the presentation of Ts and finally converges back to the initial choice attractor. For the trials choosing Ts (gray lines), the network first wanders around the diagonal line RA = RB (Rs ≈ 0) and then converges to the sure attractor Ts after its presentation. Neural dynamics therefore acts as a 3-way competition.
Fig. 3.
Behavioral performance. A: model performance (at a fixed stimulus duration of 627 ms). Left: probability of choosing Ts (Psure) decreases as a sigmoid function of the motion strength. Right: accuracy in trials where Ts is not shown (Pcorrect) increases as a sigmoid function of the motion strength (dashed black curve), and it is improved in trials when Ts was shown but waived (solid black curve). B: at different stimulus durations, Psure decreases with motion strength and stimulus duration; Pcorrect is higher in trials where Ts was shown but waived (solid lines, filled circles) than that where Ts was not shown (dashed lines, open circles). C: behavioral data from Kiani and Shadlen (2009) task using awake monkeys. Comparing B with C, model reproduces salient behavioral observations from the monkey experiment. Experimental data adapted with permission from Kiani and Shadlen (2009).
Fig. 4.
Differential activity |RA − RB| determines whether Ts is waived. A and B: single-trial dynamics of the network in the decision state space, where the population firing rates RA and RB are plotted against each other (the starting point of each network trajectory is marked as a red circle, and ending point is marked as an open circle). Gray: trials when Ts was waived; black: trials when Ts was selected. The dynamical trajectories are shown from 100 ms after motion onset to its offset (left), then to Ts input onset (right), at different motion strengths (A: 3.2%; B: 12.8%; stimulus duration: 627 ms). At the onset of motion stimulus, both RA and RB are high (∼90 Hz), near the diagonal line, because of the presentation of directional targets. The population dynamics first decays along the diagonal line, induced by a suppression of target inputs after motion onset. In trials when Ts was waived, the network trajectory converges to 1 of 2 target attractors (where RA is high and RB is low, or vice versa), whereas, in trials when Ts was selected, the population dynamics continues to wander randomly around the diagonal line. The absolute value of differential activity at Ts onset therefore determines whether Ts is waived. C and D: distribution of RA − RB at Ts onset is a function of the motion strength (C: 3.2%; D: 12.8%) and stimulus duration (presented in each column), where the percentage of trials around 0 decreases with the motion strength and stimulus duration. This explains why Psure decreases with the motion strength and stimulus duration.
Fig. 5.
Onset time of Ts determines the probability of choosing Ts but has little impact on accuracy. A: at a fixed motion strength and stimulus duration (12.8% and 627 ms, respectively), RA − RB continues to change after motion offset (time presented in each column is relative to motion input offset time) and settles only in the late phase of the delay (>1,200 ms in simulation). B and C: Psure decreases as a function of Ts input onset time (575 ms: blue; 750 ms: green; 925 ms: red), while Pcorrect remains unaffected. D: probability of choosing TA (at the end of the delay) depends on the differential activity, RA − RB, at Ts input onset time (filled circles: simulation data from B and C where coh = 12.8% and motion direction toward TA; dashed line: logistic function fit). When |RA − RB| is large, the sign of RA − RB determines the choice at Ts onset, i.e., positive for TA and negative for TB. If |RA − RB| is small (RA − RB from −5 Hz to 5 Hz), the probability of choosing TA increases with RA − RB. Data in this figure are composed of those at all Ts onset times.
Fig. 6.
The probability of waiving Ts reflects choice confidence. A: confidence is defined as the probability of waiving Ts at each |RA − RB|, i.e., the differential activity of 2 competing bumps at the moment of the sure target input onset, in single trials. A logistic function fit (red dashed line) is performed on the data from all computer simulations with Ts presented. B: comparison of probability of correctness and confidence at each |RA − RB| level in single trials. Both confidence and probability of correctness at each |RA − RB| level in single trials are computed at decision time of trials without Ts presentation. Probability of correctness increases as a monotonic function of confidence, which implies that confidence in our model would also be a good measurement of the subjective correct rate or log odds of choice. C: confidence assessment at Ts input onset (the duration from motion input offset to the time of confidence estimation is fixed, i.e., 575 ms, top) increases with the motion strength and stimulus duration. D: confidence assessment at an identical time after the motion input onset (the duration from motion onset to confidence estimation is fixed, i.e., 1,550 ms, top) saturates after a short period of stimulus duration. In this case, early evidence plays a dominant role in confidence estimation.
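The logistic fit in panel A maps the differential activity at Ts onset to a probability of waiving Ts. The sketch below illustrates that kind of fit on synthetic data only: the generating parameters (`TRUE_B0`, `TRUE_B1`), the 0–20 Hz range, and the use of Newton's method are assumptions made for the example, and none of the numbers come from the paper.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, n_iter=25):
    """Fit P(y=1 | x) = sigmoid(b0 + b1 * x) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p          # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w             # entries of the negative Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Solve the 2x2 system H * delta = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic trials: x is |RA - RB| in Hz at Ts onset, y = 1 if Ts was waived.
TRUE_B0, TRUE_B1 = -2.0, 0.5  # hypothetical generating parameters
rng = random.Random(0)
xs = [rng.uniform(0.0, 20.0) for _ in range(2000)]
ys = [1 if rng.random() < sigmoid(TRUE_B0 + TRUE_B1 * x) else 0 for x in xs]

b0_hat, b1_hat = fit_logistic(xs, ys)


def confidence(diff_hz):
    """Fitted probability of waiving Ts at a given |RA - RB|."""
    return sigmoid(b0_hat + b1_hat * diff_hz)


print(f"fitted intercept {b0_hat:.2f}, fitted slope {b1_hat:.2f} per Hz")
```

Once fitted, `confidence(diff_hz)` can be evaluated on any single trial's differential activity, which mirrors the caption's point that confidence is read out from one neural signal per trial.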
Fig. 7.
Low confidence results in changes of mind to Ts. A: trials with low confidence exhibit changes of mind to Ts (motion strength: 6.4%; stimulus duration: 243 ms). Top: sample trials with low confidence, small |RA − RB|. Even though the network has reached 1 of the 2 choice attractors [left: TA (black lines); right: TB (red lines)], upon the presentation of Ts, the neural pool selective for Ts takes over (blue lines), so there are changes of mind. Bottom: sample trials with high confidence, large |RA − RB|. No changes of mind take place. Choice confidence, cc, for each trial is estimated at the time of Ts input onset and shown at the top of each panel. B, left: across trials (averaged over different stimulus durations), the probability of shifting to Ts decreases with the motion strength. Right: in error (correct) trials, this probability increases (decreases, respectively) with the motion strength.
Fig. 8.
Effect of Ts input strength on the behavioral performance. In this simulation, the motion strength is fixed at coh = 12.8%, and Ts input strength at I4 = 240 pA (green circles and line) is the same as that used in Figures 2–7. A: Psure increases as a function of Ts input strength. Ts is usually waived (chosen, respectively) when Ts input strength is weak (strong, respectively). B: correct rate in the trials where Ts is waived increases as a function of Ts input strength. C: choice confidence is identical at the moment of Ts input onset (which increases as a function of stimulus duration). For a range of Ts input strength (216 pA < I4 < 264 pA), Psure decreases as a linear function of choice confidence.
Fig. 9.
Choice confidence in a reaction time (RT) task. A: RT discrimination task with confidence rating. In this task, a subject can indicate its choice at any time after the motion onset simultaneously with a direct report of confidence. B: psychometric curves. C: chronometric curves. Pcorrect increases while RT decreases with the motion strength. D–F: confidence reported as a post hoc feature of decision. D: choice confidence increases with motion strength (see also the result in Fig. 5, Beck et al. 2008). E: confidence decreases as an inverse function of RT [cc = a/(t − b) + c; a = 91.24 ms, b = 1369.35 ms, c = 1.089 are parameters to fit, R² = 0.998; black line]. F: confidence increases (decreases) as a function of the motion strength in correct (error, respectively) trials. B and D imply that choice confidence increases with Pcorrect. We found that, for difficult trials, the simulation exhibits overconfidence (confidence estimation is greater than correct rate), whereas, for easy trials, it exhibits underconfidence (confidence estimation is lower than correct rate). G and H: variation of under-/overconfidence score with the increase of confidence in the fixed-duration (FD) task with a delay of 627 ms and RT task, respectively. The network behaves with overconfidence (above 0) in very difficult trials (at 0, 3.2% and 6.4% motion strengths for FD task; at 0 and 1.6% motion strengths for RT task), but with underconfidence (below 0) in easy and moderately difficult trials.
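The fitted relation in panel E, cc = a/(t − b) + c, can be evaluated directly with the parameters quoted in the caption. The sketch below does only that. Restricting t to values below b is an inference from the parameter values (with a > 0, that is the branch on which confidence decreases with RT and stays below c), not something the caption states, and the sample RTs are arbitrary.

```python
# Parameters quoted in the caption of Fig. 9E.
A_MS = 91.24
B_MS = 1369.35
C = 1.089


def confidence_from_rt(rt_ms):
    """Evaluate cc = a / (t - b) + c for a reaction time t in milliseconds.

    Assumption: the fit is meant for t < b, where the curve decreases with t.
    """
    if rt_ms >= B_MS:
        raise ValueError("rt_ms must be below b for this branch of the fit")
    return A_MS / (rt_ms - B_MS) + C


# Arbitrary sample reaction times, chosen only to show the trend.
sample_rts = [400.0, 600.0, 800.0, 1000.0, 1200.0]
curve = [(rt, confidence_from_rt(rt)) for rt in sample_rts]

for rt, cc in curve:
    print(f"RT = {rt:6.0f} ms -> cc = {cc:.3f}")
```

On this branch the confidence falls slowly for short RTs and steeply as t approaches b, consistent with the caption's statement that confidence decreases as an inverse function of RT.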

References

    1. Albantakis L, Deco G. Changes of mind in an attractor network of decision-making. PLoS Comput Biol 7: e1002086, 2011.
    2. Barthelmé S, Mamassian P. Flexible mechanisms underlie the evaluation of visual confidence. Proc Natl Acad Sci USA 107: 20834–20839, 2010.
    3. Beck JM, Ma WJ, Kiani R, Hanks T, Churchland AK, Roitman J, Shadlen MN, Latham PE, Pouget A. Probabilistic population codes for Bayesian decision making. Neuron 60: 1142–1152, 2008.
    4. Brunton BW, Botvinick MM, Brody CD. Rats and humans can optimally accumulate evidence for decision-making. Science 340: 95–98, 2013.
    5. Churchland AK, Kiani R, Shadlen MN. Decision-making with multiple alternatives. Nat Neurosci 11: 693–702, 2008.
