Development and External Validation of a Detection Model to Retrospectively Identify Patients With Acute Respiratory Distress Syndrome
- PMID: 40197621
- PMCID: PMC12919718
- DOI: 10.1097/CCM.0000000000006662
Abstract
Objective: The aim of this study was to develop and externally validate a machine-learning model that retrospectively identifies patients with acute respiratory distress syndrome (ARDS) using electronic health record (EHR) data.
Design: In this retrospective cohort study, ARDS was identified via physician adjudication in three cohorts of patients with hypoxemic respiratory failure (training, internal validation, and external validation). Machine-learning models were trained to classify ARDS using vital signs, respiratory support, laboratory data, medications, chest radiology reports, and clinical notes. The best-performing models were internally and externally validated using the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve, integrated calibration index (ICI), sensitivity, specificity, positive predictive value (PPV), and ARDS timing.
Patients: Patients with hypoxemic respiratory failure undergoing mechanical ventilation within two distinct health systems.
Interventions: None.
Measurements and main results: There were 1,845 patients in the training cohort, 556 in the internal validation cohort, and 199 in the external validation cohort. ARDS prevalence was 19%, 17%, and 31%, respectively. Regularized logistic regression models analyzing structured data (EHR model) and structured data plus radiology reports (EHR-radiology model) had the best performance. During internal and external validation, the EHR-radiology model had an AUROC of 0.91 (95% CI, 0.88-0.93) and 0.88 (95% CI, 0.87-0.93), respectively. Externally, the ICI was 0.13 (95% CI, 0.08-0.18). At a specified model threshold, sensitivity and specificity were 80% (95% CI, 75%-98%), PPV was 64% (95% CI, 58%-71%), and the model identified patients a median of 2.2 hours (interquartile range, 0.2-18.6 hr) after they met Berlin ARDS criteria.
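The threshold-dependent metrics reported above (sensitivity, specificity, PPV) and the AUROC have standard definitions. The following is an illustrative sketch, not the authors' code, showing how these metrics are computed from model probabilities against physician-adjudicated labels (1 = ARDS, 0 = not ARDS); the toy scores are made up for demonstration.

```python
# Illustrative sketch (not the study's pipeline): standard definitions of the
# evaluation metrics reported in the abstract, computed in pure Python.

def auroc(y_true, y_score):
    """AUROC via the rank-sum (Mann-Whitney U) formulation:
    the probability a random ARDS case scores higher than a random non-case."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_metrics(y_true, y_score, threshold):
    """Sensitivity, specificity, and PPV at a fixed probability threshold."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),   # fraction of ARDS cases detected
        "specificity": tn / (tn + fp),   # fraction of non-cases correctly excluded
        "ppv": tp / (tp + fp),           # fraction of flagged patients with ARDS
    }

# Toy example with hypothetical predicted probabilities
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
print(auroc(y_true, y_score))                       # ~0.867
print(threshold_metrics(y_true, y_score, 0.5))
```

Note how PPV, unlike sensitivity and specificity, depends on prevalence, which is why it differs across cohorts with 17% versus 31% ARDS prevalence even at the same threshold.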
Conclusions: Machine-learning models analyzing EHR data can retrospectively identify patients with ARDS across different institutions.
Keywords: ARDS; acute lung injury; hypoxemic respiratory failure; machine learning; mechanical ventilation.
Copyright © 2025 by the Society of Critical Care Medicine and Wolters Kluwer Health, Inc. All Rights Reserved.
Conflict of interest statement
Michael Sjoding previously received royalties for a software technology that processes chest radiograph images to detect acute respiratory distress syndrome. This software was previously licensed to AirStrip Technologies, Inc. Meeta Kerlin is a member of a data safety monitoring board unrelated to this article. Dr. Levy received funding from the National Institutes of Health (NIH). Drs. Levy, Ginestra, Kohn, Patel, and Sjoding received support for article research from the NIH. Dr. Ginestra’s institution received funding from the National Heart, Lung, and Blood Institute. Dr. McSparron received funding from UpToDate and Springer. Drs. Kerlin’s and Sjoding’s institutions received funding from the NIH. Dr. Sjoding received funding from Airstrip. The remaining authors have disclosed that they do not have any potential conflicts of interest.
References
- Ranieri VM, Rubenfeld GD, Thompson BT, et al. ; ARDS Definition Task Force: Acute respiratory distress syndrome: The Berlin definition. JAMA 2012; 307:2526–2533 - PubMed
- Bellani G, Laffey JG, Pham T, et al. ; LUNG SAFE Investigators: Epidemiology, patterns of care, and mortality for patients with acute respiratory distress syndrome in intensive care units in 50 countries. JAMA 2016; 315:788–800 - PubMed