Large language model-based identification of venous thromboembolism diagnostic delays
- PMID: 41058083
- DOI: 10.1002/jhm.70194
Abstract
Background: Delayed diagnosis of venous thromboembolism (VTE) is prevalent among hospitalized patients, yet case identification is challenging and feedback is limited.
Objective: To develop a large language model (LLM)-based electronic trigger to identify VTE diagnostic delays.
Methods: All admissions to internal medicine (IM) residents at NYU Langone Health between January 2022 and December 2023 (n = 20,843) were included. Using an open-source LLM, prompts were validated to detect (1) whether residents considered VTE in admission notes and (2) whether VTE was confirmed in five types of imaging reports (n = 100 for each prompt validation set). The validated prompts were then applied to flag discordant cases, in which the admission note differential omitted VTE but an imaging report confirmed VTE. Two hospitalists reviewed discordant cases using a validated tool to identify diagnostic delays. Each hospitalization was labeled as a diagnostic delay, an in-hospital complication, or a false positive. Exclusion criteria were then implemented based on patterns among the in-hospital complications and false positives. Positive predictive value (PPV) and negative predictive value (NPV) were calculated.
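The discordance rule described in the Methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Admission` record, its field names, and the sample records are hypothetical, and the two boolean fields stand in for the outputs of the validated LLM prompts.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Admission:
    """Hypothetical record; the field names are assumptions for illustration."""
    admission_id: str
    note_considers_vte: bool    # stand-in for the admission-note prompt output
    imaging_confirms_vte: bool  # stand-in for the imaging-report prompt output


def is_discordant(adm: Admission) -> bool:
    # Discordant: imaging confirmed VTE, but the admission note
    # differential did not consider it.
    return adm.imaging_confirms_vte and not adm.note_considers_vte


def flag_for_review(admissions: Iterable[Admission]) -> List[str]:
    # Return the IDs of admissions that would go to manual review.
    return [a.admission_id for a in admissions if is_discordant(a)]


# Made-up sample records, not data from the study.
sample = [
    Admission("A1", note_considers_vte=True, imaging_confirms_vte=True),
    Admission("A2", note_considers_vte=False, imaging_confirms_vte=True),
    Admission("A3", note_considers_vte=False, imaging_confirms_vte=False),
]
flagged_ids = flag_for_review(sample)
print(flagged_ids)
```

In this sketch only the second record is flagged. A flag is a candidate for review rather than a confirmed delay, since reviewers may still label it an in-hospital complication or a false positive.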
Results: The LLM prompts classified admission notes and VTE imaging studies with high accuracy (range 98%-100%; n = 699 VTE cases identified). Of the 137 cases the LLM-based electronic trigger flagged as potential diagnostic delays, 31 were true positives, 60 were in-hospital complications, and 46 were false positives. A diagnostic delay occurred in 4.4% of all VTE hospitalizations. With the exclusion criteria, the PPV was 48% (95% confidence interval [CI], 35%-62%) and the NPV was 95% (95% CI, 87%-98%).
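The counts reported in the Results can be checked with a short Python sketch. It assumes that the 137 flagged cases were counted before the exclusion criteria were applied, and it uses a Wilson score interval purely as an illustration; the abstract does not state which confidence interval method the authors used, and the denominators behind the reported 48% PPV and 95% NPV are not given, so they are not reproduced here.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (an assumed method, see lead-in)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts taken from the Results.
true_delays = 31
complications = 60
false_positives = 46
vte_cases = 699

flagged = true_delays + complications + false_positives  # 137 flagged cases

# Share of all VTE hospitalizations with a diagnostic delay.
delay_rate = true_delays / vte_cases  # about 4.4%

# Share of flagged cases confirmed as delays, assuming no exclusions yet.
raw_ppv = true_delays / flagged  # about 22.6%

low, high = wilson_interval(true_delays, flagged)
print(f"flagged={flagged}, delay rate={delay_rate:.1%}, raw PPV={raw_ppv:.1%}")
print(f"Wilson 95% interval for raw PPV: {low:.1%} to {high:.1%}")
```

The three review labels sum to the 137 flagged cases, and 31 of 699 matches the reported 4.4%.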
Conclusions: We developed the first LLM-based electronic trigger to identify VTE diagnostic delays, with higher performance than existing non-LLM electronic triggers. LLM-based approaches can facilitate diagnostic performance feedback and are scalable to other conditions and institutions.
© 2025 Society of Hospital Medicine.
