Metacognitive sensitivity: The key to calibrating trust and optimal decision making with AI

Doyeon Lee et al. PNAS Nexus. 2025 Apr 24;4(5):pgaf133. doi: 10.1093/pnasnexus/pgaf133. eCollection 2025 May.
Abstract

Knowing when to trust and incorporate the advice from artificially intelligent (AI) systems is of increasing importance in the modern world. Research indicates that when AI provides high confidence ratings, human users often correspondingly increase their trust in such judgments, but these increases in trust can occur even when AI fails to provide accurate information on a given task. In this piece, we argue that measures of metacognitive sensitivity provided by AI systems will likely play a critical role in (1) helping individuals to calibrate their level of trust in these systems and (2) optimally incorporating advice from AI into human-AI hybrid decision making. We draw upon a seminal finding in the perceptual decision-making literature that demonstrates the importance of metacognitive ratings for optimal joint decisions and outline a framework to test how different types of information provided by AI systems can guide decision making.

Keywords: artificial intelligence; joint decision making; metacognitive sensitivity; optimal decisions; trust calibration.


Figures

Fig. 1.
Possible types of reports from AI about its decisions. (A) Type 1 reports about behaviors. One possibility is that AI may only report the outcome of a specific decision it has chosen to make, or how frequently it is correct for a particular type of decision. In the example shown here, the AI agent determines which stimulus contains an oriented Gabor and reports its overall accuracy across trials on this task. (B) Subjective (e.g. confidence) reports. Potentially, AI could report how confident it is in a specific decision, or some other type 2 subjective rating. Depending on the algorithm used, this report could be based on something akin to distance from a decision or classification boundary using signal detection theory, support vector machines, or some other algorithm. (C) Reports of metacognitive sensitivity, schematic from Fleming and Lau (50). AI systems may also be able to summarize their metacognitive sensitivity, or how effectively their confidence judgments distinguish between correct and incorrect judgments over long-run averages. In this sense, observers could gain insights into the degree to which they could use/trust the metacognitive or subjective ratings offered by a system on a given trial. (D) Finally, AI models could also try to introspect on their decision-making process to identify why or how they reached specific type 1 or type 2 decisions. Critically, these insights could be about the mechanisms or functions that led to specific type 1 or type 2 decisions. Interestingly, these types of introspections can sometimes lead to increased trust but not to increased accuracy (49).
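To make metacognitive sensitivity concrete, the following minimal sketch (an illustration for this summary, not code from the paper) estimates it as the type 2 area under the ROC curve: the probability that a randomly chosen correct decision carries higher confidence than a randomly chosen incorrect one. The simulated agent and all variable names are assumptions for demonstration only.

    import numpy as np

    def type2_auroc(correct, confidence):
        """Type 2 AUROC: how well confidence separates correct from incorrect decisions.
        0.5 indicates no metacognitive sensitivity; 1.0 indicates perfect sensitivity."""
        correct = np.asarray(correct, dtype=bool)
        confidence = np.asarray(confidence, dtype=float)
        conf_correct = confidence[correct]     # confidence on correct trials
        conf_error = confidence[~correct]      # confidence on incorrect trials
        # Probability that a correct trial outranks an error trial in confidence
        # (ties count as 0.5), i.e. the Mann-Whitney formulation of the AUROC.
        diffs = conf_correct[:, None] - conf_error[None, :]
        return float(np.mean(diffs > 0) + 0.5 * np.mean(diffs == 0))

    # Hypothetical AI agent: 75% accurate, with confidence that genuinely tracks accuracy.
    rng = np.random.default_rng(0)
    correct = rng.random(1000) < 0.75
    confidence = np.where(correct,
                          rng.normal(0.8, 0.1, 1000),   # higher confidence when correct
                          rng.normal(0.6, 0.1, 1000))   # lower confidence when wrong
    print(f"Type 2 AUROC: {type2_auroc(correct, confidence):.2f}")  # well above 0.5

A report of this long-run statistic is what would let a human user judge how much weight to give the system's trial-by-trial confidence.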
Fig. 2.
(A) Task paradigm from Bahrami et al. (29). On each trial, 2 human observers viewed 2 displays and had to judge which contained a Gabor with increased contrast. In the first experiment, individual decisions were shown on the screen, and if the observers disagreed, they were allowed to communicate to reach a joint decision. Feedback was shown for each observer's judgment (blue, yellow) and the joint decision (white). (B) Four models were fit to the data from experiment 1 (squares; each square is one pair), and the collective benefit (s_dyad/s_max) was plotted on the y-axis. Results showed that a model that assumed confidence accurately reflected the probability of being correct (weighted confidence sharing [WCS] model) and a model that assumed the means and standard deviations of sensory responses are shared (direct signal sharing [DSS] model) fit the data equally well. (C) In a second experiment, extra noise was added to the Gabor patches in the displays of one participant, both participants, or neither. Results showed that a model that incorporated confidence judgments (WCS) provided the best account of the data. BF, behavior and feedback; CF, coin flip.
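For reference, a minimal sketch of the two models' dyad-sensitivity predictions, assuming the standard formulations reported by Bahrami et al. (29): under WCS, s_dyad = (s1 + s2)/sqrt(2), and under DSS, s_dyad = sqrt(s1^2 + s2^2). The sensitivity values below are illustrative, not data from the study.

    import numpy as np

    def wcs_dyad(s1, s2):
        # Weighted confidence sharing: confidence is assumed to track accuracy,
        # so only confidence is exchanged between the two observers.
        return (s1 + s2) / np.sqrt(2)

    def dss_dyad(s1, s2):
        # Direct signal sharing: means and standard deviations of the sensory
        # responses are shared, yielding statistically optimal combination.
        return np.sqrt(s1**2 + s2**2)

    s1, s2 = 1.0, 0.3   # hypothetical individual sensitivities, made very unequal
    for name, s_dyad in [("WCS", wcs_dyad(s1, s2)), ("DSS", dss_dyad(s1, s2))]:
        print(f"{name}: s_dyad = {s_dyad:.2f}, collective benefit = {s_dyad / max(s1, s2):.2f}")
    # When sensitivities differ greatly (as when noise is added to one observer's display),
    # WCS predicts a collective benefit below 1 (the dyad does worse than its better member),
    # whereas DSS never does; that divergence is what lets the second experiment separate the models.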

References

    1. Zhao WX, et al. 2023. A survey of large language models. arXiv 2303.18223. doi: 10.48550/arXiv.2303.18223, preprint: not peer reviewed.
    2. Wei J, et al. 2022. Emergent abilities of large language models. arXiv 2206.07682. doi: 10.48550/arXiv.2206.07682, preprint: not peer reviewed.
    3. Kasneci E, et al. 2023. ChatGPT for good? On opportunities and challenges of large language models for education. Learn Individ Differ. 103:102274.
    4. Alowais SA, et al. 2023. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ. 23(1):689. doi: 10.1186/s12909-023-04698-z.
    5. Sherani AMK, Khan M, Qayyum MU, Hussain HK. 2024. Synergizing AI and healthcare: pioneering advances in cancer medicine for personalized treatment. Int J Multidiscip Sci Arts. 3(2):270–277.
