Machine learning-based prediction of 6-month functional recovery in hypertensive cerebral hemorrhage: insights from XGBoost and SHAP analysis
- PMID: 40534743
- PMCID: PMC12173871
- DOI: 10.3389/fneur.2025.1608341
Abstract
Background: The rate of poor prognosis in hypertensive cerebral hemorrhage (HICH) remains high. The period of 3-6 months after onset is the phase of most rapid neurological recovery in patients with hemorrhagic stroke. Accurate early prediction of 6-month functional outcomes is critical for optimizing therapeutic strategies. This study compared the predictive performance of multiple machine learning models to identify the best model for forecasting long-term prognosis in HICH patients.
Methods: We conducted a retrospective analysis of clinical data from 807 HICH patients admitted to the Neurosurgery Department of Qinghai Provincial People's Hospital between June 2020 and June 2024. After data preprocessing, data from June 2020 to December 2023 (n = 716) were randomly split into a training set (n = 497) and a test set (n = 219) at a ratio of approximately 7:3. Data from January to June 2024 (n = 91) served as an external validation set. Recursive Feature Elimination (RFE) was used to select the optimal features, and repeated five-fold cross-validation was used to reduce the risk of overfitting. The performance of XGBoost, Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) models was evaluated using Area Under the Curve (AUC) and Decision Curve Analysis (DCA). The optimal model was interpreted via SHapley Additive exPlanations (SHAP).
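As an illustration of the AUC metric named in the Methods, the following is a minimal sketch, not the authors' code. It computes AUC as the probability that a randomly chosen patient with a poor outcome receives a higher predicted risk than a randomly chosen patient with a good outcome. The function name `roc_auc` and the toy labels and probabilities are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = poor 6-month outcome, 0 = good outcome.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

auc = roc_auc(y_true, y_prob)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation would be the usual choice for larger data.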
Results: The 6-month poor prognosis rate among the 807 HICH patients was 27.51%. The XGBoost model performed best in the training set (AUC = 0.921, 95% CI: 0.896-0.944) and remained stable in the external validation set (AUC = 0.813, 95% CI: 0.728-0.899). DCA showed that the XGBoost model provided a higher net benefit than the other models across threshold probabilities of 0%-20% and 56%-100%. SHAP analysis identified hematoma volume as the most important predictor, with secondary contributions from Glasgow Coma Scale score, white blood cell count, age, serum albumin, and systolic blood pressure, among others.
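The net benefit that DCA reports at a threshold probability can be sketched as follows. This uses the standard definition, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability; it is a minimal sketch and not the authors' implementation. The function names and the toy data are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = poor 6-month outcome, 0 = good outcome.
y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.05, 0.60, 0.70, 0.15]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(f"model: {nb_model:.2f}, treat all: {nb_all:.2f}")
```

A decision curve repeats this calculation over a grid of thresholds for each model and for the treat-all and treat-none (net benefit 0) strategies; a model is preferred over the thresholds where its curve lies highest.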
Conclusion: The XGBoost model predicted the long-term prognosis of HICH patients with high accuracy. The SHAP framework quantifies the contribution of each key pathophysiological indicator to the model's prediction for an individual patient, enabling individualized risk stratification and strategic allocation of medical resources.
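To illustrate what a per-patient SHAP contribution means, the sketch below computes exact Shapley values by enumerating feature subsets for a three-feature toy model. It does not use the SHAP library or the tree-based approximation typically applied to XGBoost, and it is not the study's model: `toy_risk`, its coefficients, the patient, and the reference values are all invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)
    features = list(range(n))

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    """Hypothetical risk score; coefficients are invented, not from the study."""
    volume_ml, gcs, age = features
    return 0.02 * volume_ml - 0.03 * gcs + 0.004 * age + 0.0005 * volume_ml * age


patient = [40.0, 8.0, 70.0]      # hematoma volume (mL), GCS score, age (years)
reference = [15.0, 14.0, 60.0]   # hypothetical reference patient

phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["hematoma volume", "GCS", "age"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is the additivity property that lets SHAP decompose an individual prediction into per-feature terms.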
Keywords: SHAP; XGBoost; hypertensive cerebral hemorrhage; machine learning; predictive model.
Copyright © 2025 He, Lu, Lv, Cheng, Zhang, Jin and Han.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.