[Preprint]. 2025 Jan 17:2025.01.16.25320564.
doi: 10.1101/2025.01.16.25320564.

Large language models outperform traditional structured data-based approaches in identifying immunosuppressed patients

Vijeeth Guggilla et al. medRxiv.

Abstract

Identifying immunosuppressed patients using structured data can be challenging. Large language models effectively extract structured concepts from unstructured clinical text. Here we show that GPT-4o outperforms traditional approaches in identifying immunosuppressive conditions and medication use by processing hospital admission notes. We also demonstrate the extensibility of our approach in an external dataset. Cost-effective models like GPT-4o mini and Llama 3.1 also perform well, but not as well as GPT-4o.

Conflict of interest statement

Competing interests: T.L.W. has received research funding from Gilead Sciences to support investigation of the relationship between immunosuppressive conditions and COVID-19 outcomes. Gilead personnel had no involvement in this research. All other authors declare no financial or non-financial competing interests.

Figures

Fig. 1: Flow diagram of study design comparing structured data and LLM performance in identifying immunosuppression, followed by external validation of LLM performance.
Structured data and admission notes were extracted for 827 SCRIPT admissions, which were adjudicated for immunosuppressive conditions and medications. The performance of structured data algorithms and LLMs in predicting the adjudicated labels was assessed. External validation of LLM predictions against adjudicated immunosuppression labels was performed on 200 MIMIC-III admission notes.
Fig. 2: Performance of structured data in identifying immunosuppressive conditions and medications.
F1 scores, precision, and recall were computed by comparing structured data predictions to adjudicated labels. (A) Performance metrics across an increasing minimum number of dates with a diagnosis code required to count as a case. (B) Confusion matrices for predicting medication use by the presence of a valid medication order in the six months prior to admission.
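The two structured data rules described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 183-day approximation of "six months", and the choice to exclude the admission date itself from the lookback window are all assumptions.

```python
from datetime import date, timedelta


def has_condition(code_dates, min_dates=1):
    """Count a patient as a case if the diagnosis code appears on at
    least `min_dates` distinct dates (hypothetical helper)."""
    return len(set(code_dates)) >= min_dates


def has_recent_medication(order_dates, admission, lookback_days=183):
    """Flag medication use if any order falls in the lookback window
    before admission. 183 days is an assumed stand-in for six months."""
    window_start = admission - timedelta(days=lookback_days)
    return any(window_start <= d < admission for d in order_dates)


# Illustrative, made-up dates: the same code recorded twice on one day
# counts as a single date.
dx_dates = [date(2024, 1, 5), date(2024, 1, 5), date(2024, 3, 2)]
print(has_condition(dx_dates, min_dates=2))  # two distinct dates
print(has_condition(dx_dates, min_dates=3))  # threshold not met

admit = date(2024, 6, 1)
print(has_recent_medication([date(2024, 2, 1)], admit))  # inside window
print(has_recent_medication([date(2023, 6, 1)], admit))  # too old
```

Raising `min_dates` makes the case definition stricter, which is the axis varied in panel A.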
Fig. 3: Prompts used for immunosuppression attribute extraction and estimation from admission notes.
Prompts for extracting (A) immunosuppressive conditions and (B) immunosuppressive medications, where the highlighted portion is a list of either (A) conditions or (B) medications.
Fig. 4: Performance of GPT-4o in identifying immunosuppressive conditions and medications in the SCRIPT cohort.
F1 scores, precision, and recall were computed by comparing GPT-4o predictions to adjudicated labels. Confusion matrices for GPT-4o predicting (A) immunosuppressive conditions and (B) immunosuppressive medication use from admission notes.
Fig. 5: Performance of GPT-4o in identifying immunosuppressive conditions and medications in the MIMIC-III cohort.
F1 scores, precision, and recall were computed by comparing GPT-4o predictions to adjudicated labels. Confusion matrices for GPT-4o predicting (A) immunosuppressive conditions and (B) immunosuppressive medication use from admission notes.
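
The precision, recall, and F1 scores named in the captions above follow from a binary confusion matrix of predictions against adjudicated labels. The sketch below uses the standard definitions; the function name and the example labels are made up for illustration and are not data from the study.

```python
def binary_metrics(predicted, adjudicated):
    """Return confusion-matrix counts plus precision, recall, and F1 for
    boolean predictions compared against boolean reference labels."""
    pairs = list(zip(predicted, adjudicated))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)

    # Guard against empty denominators, returning 0.0 by convention.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for five admissions (not study data).
model_says = [True, True, False, False, True]
adjudicated = [True, False, False, True, True]
print(binary_metrics(model_says, adjudicated))
```

The same function applies whether the predictions come from a structured data rule or from an LLM, which is what makes the comparison across Figs. 2, 4, and 5 like for like.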
