This is a preprint.
Developing and externally validating machine learning models to forecast short-term risk of ventilator-associated pneumonia
- PMID: 41646725
- PMCID: PMC12870606
- DOI: 10.64898/2026.01.28.26344858
Abstract
Purpose: Ventilator-associated pneumonia (VAP) remains one of the most serious hospital-acquired infections in the intensive care unit (ICU), with high morbidity and mortality. Early identification of patients at risk of developing VAP could enable timely diagnostics and intervention. However, current clinical tools are limited in their ability to detect the early physiologic signals that precede VAP onset. We aimed to build supervised machine learning models to predict short-term onset of VAP.
Methods: We analyzed electronic health record data from a prospective observational cohort of ICU patients in which VAP was adjudicated by a panel of critical care physicians using a standardized published protocol. Clinical features (including vital signs, ventilator settings, laboratory values, and support devices) were extracted for each patient-ICU-day. We explored unsupervised clustering to characterize feature dynamics associated with VAP onset. We built multiple machine learning models across different prediction windows (3, 5, and 7 days before VAP). We examined model performance in two external cohorts: MIMIC-IV and a secondary analysis of the AMIKINHAL trial. Results were evaluated with discrimination metrics such as AUROC.
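As an illustration of how prediction windows over patient-ICU-days might be turned into training labels, the sketch below marks each ICU day as positive when adjudicated VAP onset falls within the next 3, 5, or 7 days. The function name `label_days`, the integer day indexing, and the choice to exclude the onset day itself are assumptions made for this example, not details taken from the study.

```python
from typing import List, Optional


def label_days(icu_days: List[int], vap_day: Optional[int], horizon: int) -> List[int]:
    """Label each ICU day 1 if VAP onset occurs within `horizon` days after it, else 0.

    Hypothetical helper: `vap_day` is the adjudicated onset day, or None if
    the patient never developed VAP. The onset day itself is labeled 0 here
    (an assumption; the study may define the window differently).
    """
    labels = []
    for day in icu_days:
        days_until_vap = None if vap_day is None else vap_day - day
        if days_until_vap is not None and 0 < days_until_vap <= horizon:
            labels.append(1)
        else:
            labels.append(0)
    return labels


# Example: one patient observed on ICU days 1..10 with VAP adjudicated on day 9.
days = list(range(1, 11))
for horizon in (3, 5, 7):
    print(horizon, label_days(days, vap_day=9, horizon=horizon))
```

Under this scheme, a wider window labels more of the days before onset as positive, so the same patient contributes different label vectors to the 3-, 5-, and 7-day models.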
Results: The internal cohort included 507 patients with BAL-confirmed diagnoses: 261 developed VAP and 246 did not have VAP. Visualization using clustering identified distinct physiologic states enriched for VAP-labeled days. The best-performing model achieved an AUROC of 0.866 in predicting VAP up to seven days before clinical diagnosis. Temporal model probability trajectories showed rising model confidence in the days leading up to VAP. On external validation in MIMIC-IV, the best model achieved an AUROC of 0.817 for forecasting VAP within five days. There was low feature overlap with the AMIKINHAL trial data, leading to poor model performance. Feature analysis revealed that platelet count, positive end-expiratory pressure (PEEP), ventilator duration, and inflammatory markers were key drivers of model predictions.
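For readers less familiar with the discrimination metric reported above, AUROC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The minimal sketch below computes it by direct pairwise comparison; the function name `auroc` and the toy labels and scores are illustrative assumptions and are unrelated to the study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Compute AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: four ICU days with made-up labels and model probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of samples and is meant only to make the definition concrete; a rank-based implementation would be preferable for cohort-sized data.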
Conclusions: Machine learning models trained on routinely collected, carefully labeled ICU data can anticipate VAP onset up to a week in advance with strong predictive performance. The models generalized to data from an entirely different hospital system despite differences in practice and labeling patterns, but performed poorly when feature overlap was low. Future work should focus on real-time prospective evaluation.
Keywords: machine learning; mechanical ventilation; ventilator-associated pneumonia.
Conflict of interest statement
B.D.S. holds United States Patent No. US 10,905,706 B2, Compositions and Methods to Accelerate Resolution of Acute Lung Inflammation, and serves on the Scientific Advisory Board of Zoe Biosciences. S.E. reports relationships with Aerogen Ltd., Fisher & Paykel, and JIB. All other authors declare no competing interests.
Grants and funding
- R01 HL153122/HL/NHLBI NIH HHS/United States
- U19 AI181102/AI/NIAID NIH HHS/United States
- R21 AG075423/AG/NIA NIH HHS/United States
- K23 HL169815/HL/NHLBI NIH HHS/United States
- P01 AG049665/AG/NIA NIH HHS/United States
- P01 HL071643/HL/NHLBI NIH HHS/United States
- U19 AI135964/AI/NIAID NIH HHS/United States
- R01 HL153312/HL/NHLBI NIH HHS/United States
- P01 HL154998/HL/NHLBI NIH HHS/United States
- U01 TR003528/TR/NCATS NIH HHS/United States
- R01 HL147575/HL/NHLBI NIH HHS/United States
- R01 HL149883/HL/NHLBI NIH HHS/United States
- R00 AG068544/AG/NIA NIH HHS/United States
- I01 CX001777/CX/CSRD VA/United States
- R01 ES034350/ES/NIEHS NIH HHS/United States
- R01 HL158139/HL/NHLBI NIH HHS/United States
- R21 HD107571/HD/NICHD NIH HHS/United States
- R01 AI158530/AI/NIAID NIH HHS/United States
- R01 HL154686/HL/NHLBI NIH HHS/United States