J Hosp Med. 2025 Oct 7.
doi: 10.1002/jhm.70194. Online ahead of print.

Large language model-based identification of venous thromboembolism diagnostic delays

Verity Schaye et al. J Hosp Med.

Abstract

Background: Delayed diagnosis of venous thromboembolism (VTE) is prevalent among hospitalized patients, yet case identification is challenging and feedback limited.

Objective: To develop a large language model (LLM)-based electronic-trigger to identify VTE diagnostic delays.

Methods: All admissions to internal medicine (IM) residents at NYU Langone Health between January 2022 and December 2023 (n = 20,843) were included. Using an open-source LLM, prompts were validated to detect (1) whether residents considered VTE in admission notes and (2) VTE confirmation in five types of imaging reports (n = 100 for each prompt validation set). The validated prompts were then applied to identify discordant cases, in which the admission note differential omitted VTE but an imaging report confirmed VTE. Two hospitalists reviewed discordant cases using a validated tool to identify diagnostic delays. Hospitalizations were labeled as diagnostic delay, in-hospital complication, or false-positive. Exclusion criteria were implemented based on patterns observed among the in-hospital complications and false-positives. Positive predictive value (PPV) and negative predictive value (NPV) were calculated.
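
The discordance step described in the Methods can be sketched as a simple filter over two per-admission flags. This is a minimal illustration, not the authors' implementation: the `Admission` record, its field names, and the sample rows are hypothetical, and the two booleans stand in for the outputs of the validated admission-note and imaging-report prompts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Admission:
    # All field names are hypothetical; they stand in for LLM prompt outputs.
    admission_id: str
    vte_in_differential: bool  # did the admission note consider VTE?
    vte_on_imaging: bool       # did any imaging report confirm VTE?


def flag_discordant(admissions: List[Admission]) -> List[Admission]:
    """Return admissions where imaging confirmed VTE but the note omitted it."""
    return [
        a for a in admissions
        if a.vte_on_imaging and not a.vte_in_differential
    ]


# Illustrative rows only; not data from the study.
sample = [
    Admission("A1", vte_in_differential=True, vte_on_imaging=True),
    Admission("A2", vte_in_differential=False, vte_on_imaging=True),
    Admission("A3", vte_in_differential=False, vte_on_imaging=False),
]
flagged_ids = [a.admission_id for a in flag_discordant(sample)]
```

Cases flagged this way are only candidates; in the study they were then reviewed by two hospitalists before being labeled.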

Results: The LLM prompts classified admission notes and VTE imaging studies with high accuracy (range 98%-100%; n = 699 VTE cases identified). Of the 137 cases the LLM-based electronic-trigger flagged as potential diagnostic delays, 31 were true-positives, 60 were in-hospital complications, and 46 were false-positives. Overall, 4.4% of all VTE hospitalizations had a diagnostic delay. With the exclusion criteria, the PPV was 48% (95% confidence interval [CI], 35%-62%) and NPV was 95% (95% CI, 87%-98%).
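
As a worked check on the counts above, the sketch below computes two proportions directly from the reported numbers: the share of VTE hospitalizations with a confirmed delay (31 of 699) and the share of flagged cases that were true delays before any exclusion criteria (31 of 137). The second figure is plain arithmetic on the reported counts, not a value the abstract reports. The abstract does not say which confidence interval method was used, so the Wilson score interval here is an assumption chosen for illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract.
true_delays = 31
flagged_cases = 137
vte_hospitalizations = 699

# Share of all VTE hospitalizations with a confirmed diagnostic delay.
delay_rate = true_delays / vte_hospitalizations

# Share of flagged cases that were true delays, before exclusion criteria.
raw_ppv = true_delays / flagged_cases
raw_ppv_low, raw_ppv_high = wilson_interval(true_delays, flagged_cases)

print(f"delay rate: {delay_rate:.1%}")
print(f"pre-exclusion PPV: {raw_ppv:.1%} ({raw_ppv_low:.1%} to {raw_ppv_high:.1%})")
```

The post-exclusion PPV and the NPV reported above cannot be recomputed this way, because the abstract does not give the counts that remain after the exclusion criteria are applied.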

Conclusions: We developed the first LLM-based electronic-trigger to identify VTE diagnostic delays, with higher performance than existing non-LLM electronic-triggers. LLM-based approaches can facilitate diagnostic performance feedback and are scalable to other conditions and institutions.
