Review

Acad Emerg Med. 2025 Mar;32(3):327-339. doi: 10.1111/acem.15066. Epub 2024 Dec 15.

Leveraging artificial intelligence to reduce diagnostic errors in emergency medicine: Challenges, opportunities, and future directions

R Andrew Taylor et al. Acad Emerg Med. 2025 Mar.

Abstract

Diagnostic errors in health care pose significant risks to patient safety and are disturbingly common. In the emergency department (ED), the chaotic and high-pressure environment increases the likelihood of these errors, as emergency clinicians must make rapid decisions with limited information, often under cognitive overload. Artificial intelligence (AI) offers promising solutions to reduce diagnostic errors in three key areas: information gathering, clinical decision support (CDS), and feedback through quality improvement. AI can streamline the information-gathering process by automating data retrieval, reducing cognitive load, and providing clinicians with essential patient details quickly. AI-driven CDS systems enhance diagnostic decision making by offering real-time insights, reducing cognitive biases, and prioritizing differential diagnoses. Furthermore, AI-powered feedback loops can facilitate continuous learning and refinement of diagnostic processes by providing targeted education and outcome feedback to clinicians. By integrating AI into these areas, the potential for reducing diagnostic errors and improving patient safety in the ED is substantial. However, successfully implementing AI in the ED is challenging and complex. Developing, validating, and implementing AI as a safe, human-centered ED tool requires thoughtful design and meticulous attention to ethical and practical considerations. Clinicians and patients must be integrated as key stakeholders across these processes. Ultimately, AI should be seen as a tool that assists clinicians by supporting better, faster decisions and thus enhances patient outcomes.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE 1
Factors that affect diagnostic error. Both internal and external factors increase diagnostic error in the ED setting. ECs' internal states may be stressed by physiological challenges like hunger and fatigue, emotional impacts from prior experiences, and intrinsic skills like task switching. The external ED environment is replete with stimuli that can distract and strain clinicians, with an ensuing risk of diagnostic error. Factors that appear to be intrinsic to the ED, like inadequate patient data, time and waiting-room pressure, high decision frequency, and even noise, can all increase the risk of diagnostic error. Similarly, patient-specific and disease-specific factors make decision making more difficult, while systemic organizational dysfunction and pressures can turn diagnostic decision making into a hazardous exercise. EC, emergency clinician.
FIGURE 2
Integrated dual‐process decision model. Dual‐process theory proposes that two parallel cognitive systems contribute to decisions. System 1 describes rapid heuristic decision making that improves dramatically with expertise and uses pattern matching extensively. System 2 describes deliberative analytic thought that is much slower and that is thought to train and correct System 1 processes. While we now believe that the two systems cooperate, the two processes offer a helpful approach to optimizing AI‐based decision support tools. In this model, AI tools like NLP and LLMs could support information gathering and synthesis (AI1) by summarizing and collating information from across the EHR. AI‐driven tools that leverage LLMs, XAI, and RAG can support accurate working diagnoses (AI2) by providing probability‐weighted differential diagnosis lists and linking clinicians to relevant evidence. After decisions are applied and outcomes are observed, AI‐driven feedback loops (AI3) can help identify, filter, and analyze potential diagnostic errors to enhance quality improvement processes; the resulting outcome feedback and education can support future diagnostic accuracy by educating System 2 analyses and enhancing System 1 expertise. EHR, electronic health record; LLM, large language model; NLP, natural language processing; QI, quality improvement; RAG, retrieval‐augmented generation; XAI, explainable AI.
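The probability-weighted differential diagnosis list described for AI2 can be sketched in a few lines. Everything below is a hypothetical illustration, not the models discussed in the review: the candidate diagnoses, finding weights, and the simple additive scoring rule are stand-ins for what a trained LLM- or XAI-based system would produce.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    diagnosis: str
    score: float  # normalized share of total evidence weight

def weighted_differential(findings, knowledge):
    """Rank candidate diagnoses by summed evidence weights for the
    observed findings, normalized to a probability-style score."""
    scores = {dx: sum(weights.get(f, 0.0) for f in findings)
              for dx, weights in knowledge.items()}
    total = sum(scores.values()) or 1.0  # avoid division by zero
    return sorted((Candidate(dx, s / total) for dx, s in scores.items()),
                  key=lambda c: c.score, reverse=True)

# Hypothetical toy knowledge base: per diagnosis, finding -> evidence weight.
KNOWLEDGE = {
    "pulmonary embolism": {"pleuritic chest pain": 2.0, "dyspnea": 1.0},
    "pneumonia": {"fever": 2.0, "cough": 2.0, "dyspnea": 1.0},
}

ranked = weighted_differential({"pleuritic chest pain", "dyspnea"}, KNOWLEDGE)
# ranked[0] is the top candidate; a real AI2 tool would also link each
# entry to supporting evidence (e.g., via RAG) for clinician review.
```

A production system would replace the additive weights with calibrated model outputs, but the interface, findings in, a ranked and weighted differential out, matches the AI2 role in the figure.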
FIGURE 3
Hierarchical AI-powered "trigger tools" can scale QI analyses. Serial implementation of AI tools in a hierarchical screening process could support efficient, scalable QI processes. Consideration of a case for QI evaluation could arise from human referrals or AI screening algorithms. Because these methods are overly sensitive and often identify charts that do not contain quality gaps, the initial screen can be filtered through three tiers. First, LLM tools can extract diagnostic labels from text and other data to accurately characterize outcomes as diagnosis codes. Then, cases can be screened for plausibility based on symptom-disease matching using the SPADE tool. This step would eliminate cases where patients presented repeatedly but where symptoms reported at prior visits were not plausibly linked with the outcome of concern. Finally, AI tools could be used to complete the SaferDx case evaluation instrument. This step would prepare cases for efficient and thorough human QI review. Cases that prompt outcome feedback could then be funneled into AI-driven feedback loops to drive education to individual physicians. AI, artificial intelligence; LLM, large language model; QI, quality improvement.
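The three-tier screen in this figure can be expressed as a short filter chain. The sketch below is entirely hypothetical: the keyword label extractor, the plausibility map, and the review stub are stand-ins for the LLM coding step, SPADE-style symptom-disease matching, and SaferDx instrument completion named in the legend.

```python
# Hypothetical plausibility map (stand-in for SPADE-style matching):
# which presenting symptoms are compatible with which final diagnoses.
PLAUSIBLE = {
    "stroke": {"dizziness", "headache", "weakness"},
    "appendicitis": {"abdominal pain", "nausea"},
}

def extract_dx(case):
    """Tier 1: stand-in for an LLM that maps free text to a diagnosis code."""
    text = case["note"].lower()
    for dx in PLAUSIBLE:
        if dx in text:
            return dx
    return None

def is_plausible(prior_symptoms, dx):
    """Tier 2: keep only cases whose earlier symptoms match the outcome."""
    return any(s in PLAUSIBLE[dx] for s in prior_symptoms)

def structured_review(case, dx):
    """Tier 3: stand-in for auto-completing a SaferDx-style instrument."""
    return {"case_id": case["id"], "diagnosis": dx,
            "needs_human_review": True}

def tiered_qi_screen(case):
    """Run a case through the three tiers; drop it at the first tier
    that rules it out, so only plausible quality gaps reach humans."""
    dx = extract_dx(case)
    if dx is None:
        return None
    if not is_plausible(case["prior_symptoms"], dx):
        return None
    return structured_review(case, dx)
```

The serial structure is the point: each tier is cheap relative to full human chart review, so an oversensitive initial screen can be narrowed before QI staff ever see a case.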

