Machine Learning-Based Identification of Risk Factors for ICU Mortality in 8902 Critically Ill Patients with Pandemic Viral Infection
- PMID: 40807005
- PMCID: PMC12346979
- DOI: 10.3390/jcm14155383
Abstract
Background/Objectives: The SARS-CoV-2 and influenza A (H1N1)pdm09 pandemics have resulted in high numbers of ICU admissions, with high mortality. Identifying risk factors for ICU mortality at the time of admission can help optimize clinical decision making. However, the risk factors identified may differ depending on the type of analysis used. Our aim is to compare the risk factors and performance of a linear model (multivariable logistic regression, GLM) with a non-linear model (random forest, RF) in a large national cohort.
Methods: A retrospective analysis was performed on a multicenter database including 8902 critically ill patients with influenza A (H1N1)pdm09 or COVID-19 admitted to 184 Spanish ICUs. Demographic, clinical, laboratory, and microbiological data from the first 24 h were used. Prediction models were built using GLM and RF. The performance of the GLM was evaluated by area under the ROC curve (AUC), precision, sensitivity, and specificity, while that of the RF was evaluated by out-of-bag (OOB) error and accuracy. In addition, variable importance in the RF was determined in terms of accuracy reduction (AR) and Gini index reduction (GI).
Results: Overall mortality in the ICU was 25.8%. Model performance was similar, with AUC = 76% for GLM and AUC = 75.6% for RF. GLM identified 17 independent risk factors, while RF identified 19 for AR and 23 for GI. Thirteen variables were found to be important in both models. Laboratory variables such as procalcitonin, white blood cells, lactate, or D-dimer levels were not significant in GLM but were significant in RF. In contrast, acute kidney injury and the presence of Acinetobacter spp. were important variables in the GLM but not in the RF.
Conclusions: Although the performance of linear and non-linear models was similar, different risk factors were determined depending on the model used. This alerts clinicians to the limitations and usefulness of studies limited to a single type of model.
Keywords: ICU mortality; generalized linear model; mortality risk factors; pandemic viruses; random forest.
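The evaluation measures named in the Methods (AUC, sensitivity, specificity, precision, accuracy, and Gini index reduction) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy data are assumptions made purely for illustration, and nothing here reproduces the study's results.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, precision and accuracy at a fixed threshold
    (the 0.5 default is an assumption, not a value taken from the study)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "accuracy": (tp + tn) / len(y_true),
    }


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a binary label set: 1 - p^2 - (1 - p)^2 = 2p(1 - p)."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_reduction(labels: Sequence[int], go_left: Sequence[bool]) -> float:
    """Impurity decrease from one binary split: parent impurity minus the
    size-weighted impurity of the two child nodes."""
    left = [y for y, g in zip(labels, go_left) if g]
    right = [y for y, g in zip(labels, go_left) if not g]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Invented toy data: 1 = died in ICU, 0 = survived; scores are predicted risks.
toy_outcome = [1, 1, 1, 0, 0, 0, 0, 0]
toy_score = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
# Hypothetical binary risk factor used as a single tree split.
toy_split = [True, True, False, True, False, False, False, False]

toy_auc = auc(toy_outcome, toy_score)
toy_metrics = confusion_metrics(toy_outcome, toy_score)
toy_gain = gini_reduction(toy_outcome, toy_split)

if __name__ == "__main__":
    print("AUC:", toy_auc)
    print("Threshold metrics:", toy_metrics)
    print("Gini reduction of the toy split:", toy_gain)
```

In a random forest, the Gini-based importance of a variable is typically obtained by accumulating reductions like the one above over every split that uses the variable, across all trees. A variable can therefore rank highly in the RF without having a significant coefficient in a logistic regression.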
Conflict of interest statement
The authors declare no conflicts of interest.