Machine learning-based prediction of 6-month functional recovery in hypertensive cerebral hemorrhage: insights from XGBoost and SHAP analysis

Menghui He¹, Zhongsheng Lu², Yiwei Lv¹, Zihai Cheng¹, Qiang Zhang², Xiaoqing Jin², Pei Han²

Affiliations

¹ Department of Graduate School, Qinghai University, Xining, China.
² Department of Neurosurgery, Qinghai Provincial People's Hospital, Xining, China.

PMID: 40534743
PMCID: PMC12173871
DOI: 10.3389/fneur.2025.1608341

Menghui He et al. Front Neurol. 2025.

Front Neurol. 2025 Jun 4;16:1608341.

doi: 10.3389/fneur.2025.1608341. eCollection 2025.

Abstract

Background: The rate of poor prognosis in hypertensive cerebral hemorrhage (HICH) remains high. The period of 3-6 months after onset is the phase of most rapid neurological recovery in patients with hemorrhagic stroke, so accurate early prediction of 6-month functional outcomes is critical for optimizing therapeutic strategies. This study compared the predictive performance of multiple machine learning models to identify the best model for forecasting long-term prognosis in HICH patients.

Methods: We retrospectively analyzed clinical data from 807 HICH patients admitted to the Department of Neurosurgery, Qinghai Provincial People's Hospital, between June 2020 and June 2024. After data preprocessing, data from June 2020 to December 2023 (n = 716) were randomly split into a training set (n = 497) and a test set (n = 219) at a 7:3 ratio. Data from January to June 2024 (n = 91) served as an external validation set. Recursive Feature Elimination (RFE) was used to select the optimal features, and repeated five-fold cross-validation was used to reduce the risk of overfitting. Five models were compared: XGBoost, Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). Model performance was evaluated using the Area Under the Curve (AUC) and Decision Curve Analysis (DCA). The best-performing model was interpreted using SHapley Additive exPlanations (SHAP).
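As an illustration of two evaluation steps named in the Methods, the following minimal sketch computes AUC as a rank statistic and generates index splits for repeated five-fold cross-validation. It is not the authors' code: the function names `auc` and `repeated_kfold`, the number of repeats, the seed, and the toy labels and scores are all assumptions made for the example.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (train_idx, val_idx) pairs; each repeat reshuffles the data
    and splits it into k disjoint validation folds."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for i in range(k):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, folds[i]

# Hypothetical toy data: 1 = poor 6-month outcome, scores = predicted risk.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_splits = list(repeated_kfold(10, k=5, repeats=2))
```

In practice a model would be fitted on each `train` index set and scored with `auc` on the matching validation fold; averaging over folds and repeats gives a more stable estimate than a single split.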

Results: The 6-month poor prognosis rate among the 807 HICH patients was 27.51%. The XGBoost model achieved the best performance in the training set (AUC = 0.921, 95% CI: 0.896-0.944) and remained stable in the external validation set (AUC = 0.813, 95% CI: 0.728-0.899). DCA showed that the XGBoost model provided a higher net benefit than the other models across threshold probabilities of 0%-20% and 56%-100%. SHAP analysis identified hematoma volume as the most important predictor, with secondary contributions from Glasgow Coma Scale score, white blood cell count, age, serum albumin, and systolic blood pressure, among others.
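The net benefit reported by DCA can be sketched with the standard decision-curve formula, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The code below is a minimal illustration under that assumption; the function names and the ten toy patients are hypothetical and are not taken from the study data.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk is at
    or above the threshold probability."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy cohort: 1 = poor outcome, probs = model-predicted risk.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4, 0.3, 0.05, 0.15, 0.5]

nb_model = net_benefit(toy_labels, toy_probs, 0.2)
nb_all = net_benefit_treat_all(toy_labels, 0.2)
```

A decision curve plots these quantities over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all line and zero (treat none).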

Conclusion: The XGBoost model showed high accuracy in predicting long-term prognosis in HICH patients. The SHAP framework quantifies the contribution of each key pathophysiological indicator to the model's prediction for an individual patient, enabling individualized risk stratification and strategic allocation of medical resources.
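The idea of per-patient contributions can be illustrated with exact Shapley values on a toy model. This sketch enumerates all feature orderings and replaces baseline values with the patient's values one feature at a time; it is a simplification of what the SHAP library does (which averages over background data and uses tree-specific algorithms for XGBoost). The risk function `toy_risk`, its coefficients, and both feature vectors are hypothetical; only the feature names come from the abstract.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features switch from baseline to x."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]
            new = predict(current)
            phi[i] += new - prev
            prev = new
    return [p / len(orders) for p in phi]

def toy_risk(features):
    """Hypothetical risk score with one interaction term (not the paper's model)."""
    volume, gcs, age = features
    return 0.02 * volume - 0.03 * gcs + 0.004 * age + 0.0005 * volume * (15 - gcs)

# Hypothetical inputs: [hematoma volume (mL), Glasgow Coma Scale score, age (years)].
patient = [40.0, 8.0, 70.0]
reference = [15.0, 14.0, 60.0]

phi = shapley_values(toy_risk, patient, reference)
gap = toy_risk(patient) - toy_risk(reference)
```

The contributions in `phi` sum to `gap`, the difference between the patient's prediction and the reference prediction. This additivity is what a SHAP force plot visualizes for a single patient.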

Keywords: SHAP; XGBoost; hypertensive cerebral hemorrhage; machine learning; predictive model.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
ROC curve analysis of five machine learning algorithms in the training dataset for predicting the long-term prognosis of HICH patients.

**Figure 2**
ROC curve analysis of five machine learning algorithms in the external validation set for predicting the long-term prognosis of HICH patients.

**Figure 3**
Decision curve analysis of the five models, plotting net benefit across threshold probabilities.

**Figure 4**
Variable importance weights.

**Figure 5**
The SHapley Additive exPlanation (SHAP) values.

**Figure 6**
SHapley Additive exPlanation (SHAP) force plots for two selected patients. **(A)** Patient with a poor prognosis. **(B)** Patient with a good prognosis.
