J Am Med Inform Assoc. 2025 Dec 1;32(12):1888-1898.

doi: 10.1093/jamia/ocaf141.

Large language models accurately identify immunosuppression in intensive care unit patients

Vijeeth Guggilla¹, Mengjia Kang², Melissa J Bak³, Steven D Tran¹, Anna Pawlowski⁴, Prasanth Nannapaneni⁴, Luke V Rasmussen⁵, Daniel Schneider⁴, Helen K Donnelly², Ankit Agrawal⁶, David Liebovitz³, Alexander V Misharin²,⁷, G R Scott Budinger²,⁷, Richard G Wunderink²,⁷, Theresa L Walunas¹,³, Catherine A Gao²; NU SCRIPT Study Investigators

Collaborators

NU SCRIPT Study Investigators:
Alan R Hauser, Alec Peltekian, Alexander V Misharin, Alexis Rose Wolfe, Alison L Szabo, Alok Choudhary, Amy Ludwig, Anahid Amani Moghadam, Anjana V Yeldandi, Ankit Agrawal, Ankit Bharat, Anna E Pawlowski, Anthony M Joudi, Arjun Prakash Tambe, Ashley J Smith-Nunez, Benjamin D Singer, Benjamin J Ulrich, Betty Tran, Cara J Gottardi, Catherine A Gao, Chao Qi, Chiagozie O Pickens, Clara J Schroedl, Daniel Meza, Daniel Schneider, Dulce Sarai Garcia, Egon A Ozer, Elen Gusman, Elisheva D Shanes, Emily M Leibenguth, Emily M Olson, Erica Marie Hartmann, Erin A Korth, Estefani Diaz, Estefany R Guzman, Francisco J Martinez, Gabrielle Matias, G R Scott Budinger, Helen K Donnelly, Hiam Abdala-Valencia, Jack T Sumner, Jacob I Sznajder, Jacqueline M Kruser, Jakub Glowala, James M Walter, Jamie H Rowell, Jason M Arnold, John Coleman, Jon W Lomasney, Joseph Isaac Bailey, Judd Hultquist, Justin A Fiala, Justin Starren, Karen M Ridge, Karolina J Senkow, Kathryn A Helmin, Khalilah L Gates, Lacy Simmons, Lesley Pinzon, Lindsey D Gradone, Lisa F Wolfe, Lucy Luo, Luisa Morales-Nebreda, Luke V Rasmussen, Manu Jain, Marc A Sala, Maxwell Schleck, Melissa H Ross, Melissa Querrey, Mengjia Kang, Michael J Cuttica, Michelle Hinsch Prickett, Nandita Nadig, Nathaniel Rhodes, Navdeep S Chandel, Nikolay S Markov, Peter H S Sporn, Prasanth Nannapaneni, Qianli Liu, Rachel B Kadar, Rachel L Medernach, Ramon Lorenzo-Redondo, Ravi Kalhan, Rebecca K Clepp, Richard G Wunderink, Richard I Morimoto, Rogan A Grant, Ruben J Mylvaganam, Samuel Fenske, Scott A Laurenzo, Sean Smith, Seung Hye Han, Sophia Nozick, Srinivas Panchamukhi, Stephanie C Eisenbarth, Suchitra Swaminathan, Susan R Russell, Taylor A Poor, Thaddeus R Cybulski, Theresa A Lombardo, Theresa L Walunas, Thomas Bolig, Thomas Stoeger, Tien Doan, Timothy Rowe, Vijeeth Guggilla, Wan-Ting Liao, Yuan Luo, Yuliana Sokolenko, Zhan Yu, Ziyan Lu

Affiliations

¹ Institute for Artificial Intelligence in Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States.
² Division of Pulmonary and Critical Care, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States.
³ Division of General Internal Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States.
⁴ Northwestern Medicine Enterprise Data Warehouse, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States.
⁵ Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States.
⁶ Department of Electrical and Computer Engineering, Northwestern University McCormick School of Engineering, Evanston, IL 60208, United States.
⁷ Simpson Querrey Lung Institute for Translational Science (SQLIFTS), Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States.

PMID: 40977378
PMCID: PMC12490808
DOI: 10.1093/jamia/ocaf141

Vijeeth Guggilla et al. J Am Med Inform Assoc. 2025.

Abstract

Objective: Rule-based structured data algorithms and natural language processing (NLP) approaches applied to unstructured clinical notes have limited accuracy and poor generalizability for identifying immunosuppression. Large language models (LLMs) may effectively identify patients with heterogeneous types of immunosuppression from unstructured clinical notes. We compared the performance of LLMs applied to unstructured notes for identifying patients with immunosuppressive conditions or immunosuppressive medication use against 2 baselines: (1) structured data algorithms using diagnosis codes and medication orders and (2) NLP approaches applied to unstructured notes.

Materials and methods: We used hospital admission notes from a primary cohort of 827 intensive care unit (ICU) patients at Northwestern Memorial Hospital and a validation cohort of 200 ICU patients at Beth Israel Deaconess Medical Center, along with diagnosis codes and medication orders from the primary cohort. We evaluated the performance of structured data algorithms, NLP approaches, and LLMs in identifying 7 immunosuppressive conditions and 6 immunosuppressive medications.

Results: In the primary cohort, structured data algorithms achieved peak F1 scores ranging from 0.30 to 0.97 for identifying immunosuppressive conditions and medications. NLP approaches achieved peak F1 scores ranging from 0 to 1. GPT-4o outperformed or matched structured data algorithms and NLP approaches across all conditions and medications, with F1 scores ranging from 0.51 to 1. GPT-4o also performed impressively in our validation cohort (F1 = 1 for 8/13 variables).

Discussion: LLMs, particularly GPT-4o, outperformed structured data algorithms and NLP approaches in identifying immunosuppressive conditions and medications with robust external validation.

Conclusion: LLMs can be applied for improved cohort identification for research purposes.

Keywords: clinical notes; diagnosis codes; immunosuppression; large language model.

Conflict of interest statement

T.L.W. has received research funding from Gilead Sciences to support investigation of the relationship between immunosuppressive conditions and COVID-19 outcomes. Gilead personnel had no involvement in this research. All other authors declare no financial or non-financial competing interests.

Figures

**Figure 1.**
Flow diagram of study design comparing the performance of structured data algorithms, NLP approaches, and LLMs in identifying immunosuppression followed by external validation of LLM performance. Structured data and admission notes were extracted for 827 SCRIPT admissions which were adjudicated for immunosuppressive conditions and medications. The performance of structured data algorithms, NLP approaches, and LLMs in predicting the adjudicated labels was assessed. External validation of using LLMs to predict adjudicated immunosuppression labels was performed in 200 MIMIC-III admission notes.

**Figure 2.**
Performance of structured data algorithm baseline in identifying immunosuppressive conditions and medications in the SCRIPT cohort. F1 scores, F2 scores, precision, and recall were computed by comparing structured data algorithm predictions to adjudicated labels. (A) Performance metrics across an increasing minimum number of dates with a diagnosis code required to count as a case. (B) Confusion matrices for predicting medication use by the presence of a valid medication order in the 6 months prior to admission.
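
The thresholding and scoring described in this caption can be illustrated with a minimal Python sketch. It assumes a patient counts as a case when the number of distinct dates carrying a relevant diagnosis code meets a minimum, and it uses the standard F-beta definition (F2 weights recall more heavily than precision). The function names, patient identifiers, and dates are hypothetical and are not taken from the study.

```python
from datetime import date


def fbeta(precision, recall, beta=1.0):
    """Standard F-beta score; beta=1 gives F1, beta=2 gives F2."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def evaluate(predicted, adjudicated):
    """Compare boolean predictions with adjudicated labels (same patient keys)."""
    tp = sum(1 for k in adjudicated if predicted[k] and adjudicated[k])
    fp = sum(1 for k in adjudicated if predicted[k] and not adjudicated[k])
    fn = sum(1 for k in adjudicated if not predicted[k] and adjudicated[k])
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": fbeta(precision, recall, beta=1.0),
        "f2": fbeta(precision, recall, beta=2.0),
    }


def is_case(code_dates, min_dates):
    """Assumed rule: at least `min_dates` distinct dates with a diagnosis code."""
    return len(set(code_dates)) >= min_dates


# Hypothetical toy data: dates on which a diagnosis code was recorded.
code_dates = {
    "p1": [date(2020, 1, 1), date(2020, 1, 1), date(2020, 3, 5)],  # 2 distinct dates
    "p2": [date(2020, 2, 2)],                                      # 1 distinct date
    "p3": [],                                                      # no codes
    "p4": [date(2019, 5, 1), date(2019, 6, 1), date(2019, 7, 1)],  # 3 distinct dates
}
# Hypothetical adjudicated labels for the same patients.
adjudicated = {"p1": True, "p2": False, "p3": True, "p4": True}

# Score the rule at increasing thresholds, as in panel (A).
results = {}
for min_dates in (1, 2, 3):
    predicted = {k: is_case(v, min_dates) for k, v in code_dates.items()}
    results[min_dates] = evaluate(predicted, adjudicated)
    print(min_dates, results[min_dates])
```

In this toy data, raising the threshold from 1 to 2 dates removes the single false positive and lifts precision, while a threshold of 3 lowers recall; this is the kind of trade-off a threshold sweep is meant to expose.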

**Figure 3.**
Prompts used for immunosuppression attribute extraction and estimation by LLMs from admission notes. Prompts for extracting (A) immunosuppressive conditions and (B) immunosuppressive medications where the highlighted portion introduces either a list of (A) conditions or (B) medications.

**Figure 4.**
Performance of GPT-4o in identifying immunosuppressive conditions and medications in the SCRIPT cohort. F1 scores, F2 scores, precision, and recall were computed by comparing GPT-4o predictions to adjudicated labels. Confusion matrices for GPT-4o predicting (A) immunosuppressive conditions and (B) immunosuppressive medication use from admission notes.

**Figure 5.**
Performance of GPT-4o in identifying immunosuppressive conditions and medications in the MIMIC-III cohort. F1 scores, F2 scores, precision, and recall were computed by comparing GPT-4o predictions to adjudicated labels. Confusion matrices for GPT-4o predicting (A) immunosuppressive conditions and (B) immunosuppressive medication use from admission notes.
