Multicenter Study

Front Cell Infect Microbiol. 2025 Jan 8:14:1500326.

doi: 10.3389/fcimb.2024.1500326. eCollection 2024.

Interpretable machine learning-based prediction of 28-day mortality in ICU patients with sepsis: a multicenter retrospective study

Li Shen¹ ², Jiaqiang Wu³, Jianger Lan¹, Chao Chen⁴, Yi Wang⁵, Zhiping Li¹

Affiliations

¹ Department of Clinical Pharmacy, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China.
² Department of Pharmacy, Suzhou Hospital, Affiliated Hospital of Medical School, Nanjing University, Suzhou, Jiangsu, China.
³ School of Life Sciences and Biopharmaceutical Science, Shenyang Pharmaceutical University, Shenyang, China.
⁴ Department of Neonatology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China.
⁵ Department of Neurology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China.

PMID: 39844844
PMCID: PMC11751000
DOI: 10.3389/fcimb.2024.1500326

Li Shen et al. Front Cell Infect Microbiol. 2025.

Abstract

Background: Sepsis is a major cause of mortality in intensive care units (ICUs) and remains a significant global health challenge, with sepsis-related deaths adding substantially to the burden on healthcare systems worldwide. The primary objective of this study was to construct and evaluate a machine learning (ML) model for predicting 28-day all-cause mortality among ICU patients with sepsis.

Methods: Data were sourced from the eICU Collaborative Research Database (eICU-CRD), version 2.0. The main outcome was 28-day all-cause mortality. Predictors for the final model were selected using least absolute shrinkage and selection operator (LASSO) regression and the Boruta feature selection algorithm. Five machine learning algorithms, namely logistic regression (LR), decision tree (DT), extreme gradient boosting (XGBoost), support vector machine (SVM), and light gradient boosting machine (lightGBM), were used to construct models with 10-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity (recall), specificity, and F1 score. Additionally, we performed an interpretability analysis on the model that showed the most stable performance.
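As an illustration of the threshold-based metrics named in the Methods, the sketch below computes accuracy, sensitivity, specificity, and F1 score from binary predictions. Sensitivity and recall are the same quantity (true positives divided by all actual positives), so the function reports it once. This is a minimal standard-library sketch, not the authors' code; the function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = died within 28 days."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity (recall), specificity, and F1 score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,  # identical to recall
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical toy data: 3 deaths among 10 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is not included here because it is computed from predicted probabilities across all thresholds rather than from a single set of binary predictions.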

Results: The final study cohort comprised 4564 patients, among whom 568 (12.4%) died within 28 days of ICU admission. The XGBoost algorithm demonstrated the most reliable performance, achieving an AUC of 0.821 with balanced sensitivity (0.703) and specificity (0.798). The top three predictors of mortality risk were the APACHE score, serum lactate level, and aspartate aminotransferase (AST).
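The reported mortality rate can be checked directly from the cohort counts, and the sensitivity and specificity can be translated into predictive values via Bayes' rule. The sketch below does both. The predictive values are an illustrative derivation, not figures reported in the abstract, and they assume that the cohort-wide mortality rate applies to the set on which sensitivity and specificity were measured, which the abstract does not state. Function names are hypothetical.

```python
def prevalence(events, total):
    """Fraction of the cohort that experienced the outcome."""
    return events / total


def predictive_values(sensitivity, specificity, prev):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prev                # true positive fraction
    fn = (1 - sensitivity) * prev          # false negative fraction
    tn = specificity * (1 - prev)          # true negative fraction
    fp = (1 - specificity) * (1 - prev)    # false positive fraction
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Counts and metrics taken from the Results paragraph.
prev = prevalence(568, 4564)
ppv, npv = predictive_values(0.703, 0.798, prev)

print(f"28-day mortality: {prev:.1%}")
print(f"Implied PPV: {ppv:.3f}")  # assumption-dependent, see lead-in
print(f"Implied NPV: {npv:.3f}")  # assumption-dependent, see lead-in
```

Under that assumption, a low outcome prevalence pulls the positive predictive value well below the sensitivity, which is worth keeping in mind when reading the sensitivity and specificity figures.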

Conclusion: ML models reliably predicted 28-day mortality in critically ill sepsis patients. Of the models evaluated, the XGBoost algorithm exhibited the most stable performance in identifying patients at elevated mortality risk. Model interpretability analysis identified crucial predictors, potentially informing clinical decisions for sepsis patients in the ICU.

Keywords: 28-day mortality; XGBoost; machine learning; multicenter retrospective study; sepsis.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 2**
Feature selection by LASSO regression and Boruta. **(A)** LASSO coefficient profiles: log(λ) is plotted on the X-axis and the regression coefficients on the Y-axis. Each colored line represents one variable. **(B)** The tuning parameter (λ) of the LASSO model was selected by 10-fold cross-validation. The left dashed line represents λmin (minimum cross-validated error), while the right dashed line indicates λ1se (the largest λ whose cross-validated error is within one standard error of the minimum). **(C)** Feature identification via the Boruta algorithm. The X-axis shows all features, and the Y-axis shows the Z-value of each feature. The green boxes represent the initial 26 significant variables, the yellow boxes tentative variables, and the red boxes unimportant variables.
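The two dashed lines in panel (B) follow a standard rule: λmin minimizes the mean cross-validated error, and λ1se is the largest λ whose mean error does not exceed that minimum plus one standard error. A minimal sketch of the rule is shown below. The function name and the toy error curve are hypothetical and are not values from the study.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda CV error summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated error for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    # Index of the lambda with the smallest mean CV error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # One-standard-error threshold, using the SE at the minimum.
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest (most regularizing) lambda still within the threshold.
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_mean) if err <= threshold
    )
    return lambdas[i_min], lambda_1se


# Hypothetical error curve across five candidate penalties.
lambdas = [0.001, 0.005, 0.01, 0.05, 0.1]
cv_mean = [0.72, 0.68, 0.69, 0.695, 0.80]
cv_se = [0.02, 0.02, 0.02, 0.02, 0.03]

lambda_min, lambda_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

Choosing λ1se rather than λmin trades a small increase in cross-validated error for a sparser model with fewer retained predictors.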

**Figure 3**
Receiver operating characteristic (ROC) curves of the five models. **(A)** ROC of the training set. **(B)** ROC of the validation set. DT, decision tree; LGBM, light gradient boosting machine; LR, logistic regression; SVM, support vector machine; XGBoost, extreme gradient boosting.

**Figure 4**
The SHAP analysis of the XGBoost model. **(A)** A bar plot displaying the mean SHAP value for the top ten variables. **(B)** The beeswarm plots displayed the distribution of the top ten variables, with variable values represented by different colors. Each sample was represented by a colored point. The x-axis represented the SHAP value, while the color coding indicated the feature values. **(C)** SHAP waterfall plot for case 1. **(D)** SHAP waterfall plot for case 2.
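The bar plot and waterfall plots rest on two simple properties of SHAP values, sketched below with the standard library only. Global importance is commonly summarized as the mean absolute SHAP value per feature (assumed here; the caption says only "mean SHAP value"), and for a single patient the base value plus the sum of that patient's SHAP values equals the model output. The SHAP matrix and base value are hypothetical toy numbers, and the log-odds scale is an assumption that is typical for tree-based classifiers but not confirmed by the source.

```python
import math


def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value (largest first)."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


def waterfall_output(base_value, shap_row):
    """Model output for one patient: base value plus all feature contributions."""
    return base_value + sum(shap_row)


def sigmoid(x):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical SHAP values (rows = patients, columns = features).
feature_names = ["APACHE score", "lactate", "AST"]
shap_rows = [
    [0.8, -0.3, 0.1],
    [-0.6, 0.4, -0.2],
    [0.7, 0.2, 0.1],
]
base_value = -2.0  # hypothetical expected model output, assumed log-odds

ranking = mean_abs_shap(shap_rows, feature_names)
case_1_logit = waterfall_output(base_value, shap_rows[0])
case_1_risk = sigmoid(case_1_logit)

for name, value in ranking:
    print(f"{name}: {value:.3f}")
print(f"case 1 predicted risk: {case_1_risk:.3f}")
```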
