Significant adverse prognostic events in patients with urosepsis: a machine learning based model development and validation study
- PMID: 40861482
- PMCID: PMC12370708
- DOI: 10.3389/fcimb.2025.1623109
Abstract
Background: Urosepsis is a subset of sepsis with a high mortality rate, and it is rising in rank among the causes of sepsis. We aimed to use machine learning (ML) methods to construct and validate an interpretable prognostic prediction model for patients with urosepsis.
Methods: Data were collected from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, version 3.1, and divided into a training cohort and a validation cohort in a 7:3 ratio. Random Forest (RF), Lasso, Boruta, and eXtreme Gradient Boosting (XGBoost) were used to identify the most influential variables in the model development dataset, and the optimal variables were selected at the λ1se value. Model development used seven machine learning methods with ten-fold cross-validation. Accuracy and decision curve analysis (DCA) were used to evaluate model performance and select the optimal model. Internal validation metrics included area under the receiver operating characteristic curve (AUC), sensitivity, specificity, Matthews correlation coefficient, and F1-score. Finally, SHapley Additive exPlanations (SHAP) was used to explain the ML models.
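The cohort split and cross-validation described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: it assumes a simple random (unstratified) 7:3 split and 10-fold cross-validation within the training cohort. The function names `split_indices` and `k_fold` and the seed are hypothetical, and the cohort size of 1389 is the one reported in the Results; the resulting subset sizes are those of this sketch, not figures taken from the paper.

```python
import random


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle range(n) and cut it into training and validation index lists."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    indices = list(range(n))
    rng.shuffle(indices)
    cut = int(n * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold(indices, k=10):
    """Yield (fit, held_out) index lists; each index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


# 1389 patients, split 7:3, then 10 folds inside the training cohort.
train, valid = split_indices(1389)
folds = list(k_fold(train, k=10))
print(len(train), len(valid), len(folds))
```

Each candidate model would be fitted on `fit` and scored on `held_out` in every fold, with the validation cohort kept aside until the final internal validation.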
Results: A total of 1389 patients with urosepsis were included. Optimal predictors were selected through statistical regularization, yielding a parsimonious set of 9 variables for model development. The XGBoost model performed best, with an accuracy of 0.818 and an AUC of 0.904 (95% CI: 0.886-0.923). In internal validation, accuracy was 0.797, AUC was 0.869 (95% CI: 0.834-0.904), sensitivity was 0.797, specificity was 0.752, Matthews correlation coefficient was 0.597, and F1-score was 0.791, indicating that the predictive model performs well in internal validation. SHAP-based summary plots and diagrams were used to explain the XGBoost model globally.
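For readers less familiar with the threshold-based metrics reported above, the sketch below shows how accuracy, sensitivity, specificity, F1-score, and the Matthews correlation coefficient follow from a binary confusion matrix. The abstract does not report the confusion matrix, so the counts used here are hypothetical and chosen only to make the arithmetic easy to follow; the function name `binary_metrics` is likewise an assumption.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)    # positive predictive value
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = binary_metrics(tp=80, fp=20, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside F1-score when outcome classes are imbalanced.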
Conclusion: ML demonstrates strong prognostic capability in urosepsis, with the SHAP method providing clinically intuitive explanations of model predictions. This enables clinicians to identify critical prognostic factors and personalize treatments. Although our model achieved high predictive accuracy, it was derived retrospectively from a single-center database and therefore requires external validation in diverse populations; future prospective multicenter studies should address this to establish clinical generalizability.
Keywords: MIMIC-IV database; SHAP; machine learning; prognostic model; urosepsis.
Copyright © 2025 Wei, Xu, Yang, Zhang, Wang and Wan.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.