Machine learning prediction models for multidrug-resistant organism infections in ICU ventilator-associated pneumonia patients: Analysis using the MIMIC-IV database
- PMID: 40154202
- DOI: 10.1016/j.compbiomed.2025.110028
Abstract
Objective: This study aims to construct and compare four machine learning models using the MIMIC-IV database to identify high-risk factors for multidrug-resistant organism (MDRO) infection in ventilator-associated pneumonia (VAP) patients.
Methods: The study included 972 VAP patients from the MIMIC-IV database. Data encompassing demographic information, vital signs, laboratory results, and other relevant variables were collected. Class imbalance was addressed using the Synthetic Minority Over-sampling Technique (SMOTE). The dataset was randomly split into training and testing sets (8:2). LASSO regression and feature importance scores were used for feature selection. Clinical prediction models were built using logistic regression, XGBoost, random forest, and gradient boosting machine. Model performance was evaluated through receiver operating characteristic (ROC) curve analysis. Model calibration was assessed using calibration curves and Brier scores. Clinical utility was evaluated through decision curve analysis (DCA). SHapley Additive exPlanations (SHAP) were used for model interpretation.
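As an illustration of the two evaluation metrics named in the Methods, the following is a minimal sketch in plain Python of the area under the ROC curve (computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case) and the Brier score (mean squared difference between predicted probability and observed outcome). The function names `roc_auc` and `brier_score` and the toy labels and probabilities are assumptions made for illustration only; they are not the study's code or data.

```python
def roc_auc(y_true, y_score):
    """AUC via pairwise comparison of positive and negative scores.

    Equals the probability that a random positive case is scored above
    a random negative case; tied scores count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data (1 = MDRO-VAP, 0 = non-MDRO-VAP); not from the study.
y_true = [0, 0, 0, 1, 0, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"AUC = {auc:.3f}, Brier score = {brier:.4f}")
```

A higher AUC indicates better discrimination, while a lower Brier score indicates better-calibrated probabilities; the pairwise AUC above is quadratic in sample size and is meant for clarity rather than speed.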
Results: Among 972 patients, 824 were non-MDROs-VAP and 128 were MDROs-VAP. Comparative analysis revealed statistically significant differences in various clinical parameters. XGBoost exhibited the best predictive performance, incorporating 20 features with an AUC of 0.831 (95% CI: 0.785-0.877) on the test set. Calibration curves demonstrated good agreement between predicted and observed risk, and DCA supported the model's clinical utility. SHAP analysis identified the most important features: red cell distribution width, duration of mechanical ventilation, anion gap, basophil percentage, and neutrophil percentage.
Conclusion: This study established and compared four machine learning models for MDRO infections in VAP patients. XGBoost was identified as the optimal predictor, and SHAP values provided insights into 20 independent risk factors, confirming the model's excellent predictive value.
Implications for clinical practice: VAP is a common infection in ICU patients with a heightened risk of MDRO infection and increased mortality. The recognition of high bias in existing models calls for future research to employ rigorous methodologies and robust data sources, aiming to develop and validate more accurate and clinically applicable predictive models for MDRO infections in VAP patients.
Keywords: MIMIC-IV database; Machine learning; Multidrug-resistant organisms; Prediction model; Ventilator-associated pneumonia.
Copyright © 2025 Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
