[Constructing a predictive model for the death risk of patients with septic shock based on supervised machine learning algorithms]
- PMID: 38813626
- DOI: 10.3760/cma.j.cn121430-20230930-00832
Abstract
Objective: To construct and validate the best predictive model for 28-day death risk in patients with septic shock based on different supervised machine learning algorithms.
Methods: Patients with septic shock meeting the Sepsis-3 criteria were selected from the Medical Information Mart for Intensive Care-IV v2.0 (MIMIC-IV v2.0) database. Patients were randomly allocated to a training set (70%) and a validation set (30%). Predictive variables were extracted from three aspects: demographic characteristics and basic vital signs; serum indicators within 24 hours of intensive care unit (ICU) admission and complications possibly affecting those indicators; and functional scoring and advanced life support. Models for predicting 28-day death in patients with septic shock were built with five mainstream machine learning algorithms: classification and regression tree (CART), random forest (RF), support vector machine (SVM), logistic regression (LR), and super learner [SL; combining CART, RF and extreme gradient boosting (XGBoost)]. Their predictive efficacy was compared, and the best algorithm model was selected. The optimal predictive variables were determined by intersecting the variables selected by LASSO regression, RF, and XGBoost, and a predictive model was constructed from them. The discrimination of the model was validated with the receiver operating characteristic (ROC) curve, its accuracy was assessed with calibration curves, and its clinical practicality was verified through decision curve analysis (DCA).
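The random 70/30 allocation and the intersection step described in Methods can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the seed, and the three example selections are hypothetical, and the per-method selections are invented for illustration only (the abstract reports just the final intersection).

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Randomly allocate ids to a training set and a validation set."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def intersect_selected(*selections):
    """Keep only the variables chosen by every selection method."""
    common = set(selections[0])
    for selection in selections[1:]:
        common &= set(selection)
    return sorted(common)


# Hypothetical patient identifiers; only the cohort size comes from the abstract.
train_ids, valid_ids = split_train_validation(range(3295))

# Invented selections, for illustration only (not the study's results).
lasso_selected = ["age", "bmi", "lactate_max", "sofa"]
rf_selected = ["age", "bmi", "lactate_max", "heart_rate"]
xgb_selected = ["age", "bmi", "lactate_max", "platelets"]

optimal_variables = intersect_selected(lasso_selected, rf_selected, xgb_selected)
print(optimal_variables)
```

Taking the intersection is a conservative choice: a variable survives only if all three methods rank it as useful, which tends to yield a smaller and more stable set than any single method.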
Results: A total of 3,295 patients with septic shock were included; 2,164 survived and 1,131 died within 28 days, a mortality of 34.32%. Of these, 2,307 were in the training set (792 deaths within 28 days, a mortality of 34.33%) and 988 in the validation set (339 deaths within 28 days, a mortality of 34.31%). Five machine learning models were established on the training set data. With variables from all three aspects included, the area under the ROC curve (AUC) of the RF, SVM, and LR models for predicting 28-day death in the validation set was 0.823 [95% confidence interval (95%CI) 0.795-0.849], 0.823 (95%CI 0.796-0.849), and 0.810 (95%CI 0.782-0.838), respectively, all higher than that of the CART model (AUC = 0.750, 95%CI 0.717-0.782) and the SL model (AUC = 0.756, 95%CI 0.724-0.789). These three models were therefore determined to be the best algorithm models. After integrating variables from the three aspects, 16 optimal predictive variables were identified by intersecting the results of LASSO regression, RF, and XGBoost: the highest pH value, the highest albumin (Alb), the highest body temperature, the lowest lactic acid (Lac), the highest Lac, the highest serum creatinine (SCr), the highest Ca2+, the lowest hemoglobin (Hb), the lowest white blood cell count (WBC), age, simplified acute physiology score III (SAPS III), the highest WBC, acute physiology score III (APS III), the lowest Na+, body mass index (BMI), and the shortest activated partial thromboplastin time (APTT) within 24 hours of ICU admission. ROC curve analysis showed that the logistic regression model constructed with these 16 optimal predictive variables was the best predictive model, with an AUC of 0.806 (95%CI 0.778-0.835) in the validation set. The calibration and DCA curves showed that this model had high accuracy, with a net benefit reaching up to 0.3, and that it significantly outperformed traditional models based on a single functional score [APS III score, SAPS III score, and sequential organ failure assessment (SOFA) score], whose AUCs (95%CI) were 0.746 (0.715-0.778), 0.765 (0.734-0.796), and 0.625 (0.589-0.661), respectively.
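For readers unfamiliar with the two metrics reported in the Results, the sketch below computes an AUC and a DCA net benefit on a toy data set. It is an illustration under stated assumptions, not the study's analysis: the scores and labels are made up, `auc` and `net_benefit` are hypothetical helper names, the AUC is computed from pairwise comparisons (the Mann-Whitney formulation), and net benefit uses the standard formula TP/n - FP/n × pt/(1 - pt) at threshold probability pt.

```python
def auc(scores, labels):
    """Probability that a random death case scores higher than a random survivor.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def net_benefit(scores, labels, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(labels)
    true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Toy predicted risks and outcomes (1 = died within 28 days); not study data.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_auc = auc(toy_scores, toy_labels)               # 8 of 9 pairs ordered correctly
toy_nb = net_benefit(toy_scores, toy_labels, 0.5)   # 2/6 - (1/6) * 1
print(round(toy_auc, 3), round(toy_nb, 3))
```

Because net benefit can never exceed TP/n, it is bounded above by the event rate of the cohort, which here is the 28-day mortality of about 34%.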
Conclusions: The Logistic regression model, constructed using 16 optimal predictive variables including pH value, Alb, body temperature, Lac, SCr, Ca2+, Hb, WBC, SAPS III score, APS III score, Na+, BMI, and APTT, is identified as the best predictive model for the 28-day death risk in patients with septic shock. Its performance is stable, with high discriminative ability and accuracy.