Large language model-based identification of venous thromboembolism diagnostic delays

Verity Schaye¹,², Daniel J Sartori¹, Lexi Signoriello², Kiran Malhotra³,⁴, Benedict Guzman⁵, Bijal Rajput¹, Ilan Reinstein², Jesse Burk-Rafel¹,²

Affiliations

¹ Department of Medicine, NYU Grossman School of Medicine, New York, New York, USA.
² Institute for Innovations in Medical Education, NYU Grossman School of Medicine, New York, New York, USA.
³ Clinical Informatics Fellow, NYU Grossman School of Medicine, New York, New York, USA.
⁴ Department of Ophthalmology, NYU Grossman School of Medicine, New York, New York, USA.
⁵ Division of Applied AI Technologies, NYU Langone Health, New York, New York, USA.

PMID: 41058083
DOI: 10.1002/jhm.70194

Verity Schaye et al. J Hosp Med. 2025 Oct 7. doi: 10.1002/jhm.70194. Online ahead of print.

Abstract

Background: Delayed diagnosis of venous thromboembolism (VTE) is prevalent among hospitalized patients, yet case identification is challenging and feedback limited.

Objective: To develop a large language model (LLM)-based electronic-trigger to identify VTE diagnostic delays.

Methods: All admissions to internal medicine (IM) residents at NYU Langone Health between January 2022 and December 2023 (n = 20,843) were included. Using an open-source LLM, prompts were validated to detect (1) whether residents considered VTE in admission notes and (2) whether VTE was confirmed in five types of imaging reports (n = 100 for each prompt validation set). The validated prompts were then applied to flag discordant hospitalizations, in which the admission note differential omitted VTE but an imaging report confirmed it. Two hospitalists reviewed discordant cases using a validated tool to identify diagnostic delays. Hospitalizations were labeled as diagnostic delay, in-hospital complication, or false-positive. Based on patterns among the in-hospital complications and false-positives, exclusion criteria were implemented. Positive predictive value (PPV) and negative predictive value (NPV) were calculated.
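The discordance rule and the two summary metrics described in the Methods can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the `Hospitalization` record, its field names, and the helper functions are hypothetical, and the two boolean flags stand in for the outputs of the validated LLM prompts.

```python
from dataclasses import dataclass


@dataclass
class Hospitalization:
    # Hypothetical record; field names are illustrative, not from the study.
    encounter_id: str
    note_considers_vte: bool    # assumed LLM output for the admission note
    imaging_confirms_vte: bool  # assumed LLM output across imaging reports


def is_discordant(h: Hospitalization) -> bool:
    """Flag a case when imaging confirms VTE but the admission note omitted it."""
    return h.imaging_confirms_vte and not h.note_considers_vte


def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: share of flagged cases that are true delays."""
    return true_pos / (true_pos + false_pos)


def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: share of unflagged cases that are not delays."""
    return true_neg / (true_neg + false_neg)


# Toy data, invented purely to exercise the rule.
cases = [
    Hospitalization("A", note_considers_vte=True, imaging_confirms_vte=True),
    Hospitalization("B", note_considers_vte=False, imaging_confirms_vte=True),
    Hospitalization("C", note_considers_vte=False, imaging_confirms_vte=False),
]
flagged = [h.encounter_id for h in cases if is_discordant(h)]
print(flagged)  # ['B']
```

In this sketch, only flagged cases would go on to hospitalist review; the exclusion criteria mentioned above are not modeled because the abstract does not specify them.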

Results: The LLM prompts classified admission notes and VTE imaging studies with high accuracy (range 98%-100%, n = 699 VTE cases identified). Of the 137 hospitalizations the LLM-based electronic-trigger flagged as potential diagnostic delays, 31 were true-positives, 60 were in-hospital complications, and 46 were false-positives. Overall, 4.4% of all VTE hospitalizations had a diagnostic delay. With the exclusion criteria, the PPV was 48% (95% confidence interval [CI], 35%-62%) and NPV was 95% (95% CI, 87%-98%).
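The counts in the Results can be cross-checked with simple arithmetic. The sketch below assumes that the 699 identified VTE cases are the denominator for the 4.4% figure, which the abstract implies but does not state outright. The share of flagged cases that were true delays before exclusions is a derived value, not one reported in the abstract.

```python
# Counts taken from the Results paragraph.
true_positives = 31
in_hospital_complications = 60
false_positives = 46
flagged_cases = 137
vte_hospitalizations = 699  # assumed denominator for the 4.4% figure

# The three review labels should account for every flagged case.
assert true_positives + in_hospital_complications + false_positives == flagged_cases

# Diagnostic delay rate among VTE hospitalizations.
delay_rate = true_positives / vte_hospitalizations
print(f"{delay_rate:.1%}")  # 4.4%

# Derived (not reported): share of flagged cases that were true delays
# before any exclusion criteria were applied.
true_delay_share = true_positives / flagged_cases
print(f"{true_delay_share:.1%}")  # 22.6%
```

The reported PPV of 48% applies after the exclusion criteria, so it cannot be reproduced from these counts alone.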

Conclusions: We developed the first LLM-based electronic-trigger to identify VTE diagnostic delays, with higher performance than existing non-LLM electronic-triggers. LLM-based approaches can facilitate diagnostic performance feedback and are scalable to other conditions and institutions.
