Front Neurol. 2025 Jun 4:16:1608341.
doi: 10.3389/fneur.2025.1608341. eCollection 2025.

Machine learning-based prediction of 6-month functional recovery in hypertensive cerebral hemorrhage: insights from XGBoost and SHAP analysis

Menghui He et al. Front Neurol. 2025.

Abstract

Background: The rate of poor prognosis after hypertensive cerebral hemorrhage (HICH) remains high. The period of 3-6 months after onset is the most rapid phase of neurological recovery in hemorrhagic stroke patients, so accurate early prediction of 6-month functional outcomes is critical for optimizing therapeutic strategies. This study compared the predictive performance of multiple machine learning models to identify the best model for forecasting long-term prognosis in HICH patients.

Methods: We conducted a retrospective analysis of clinical data from 807 HICH patients admitted to the Neurosurgery Department of Qinghai Provincial People's Hospital between June 2020 and June 2024. After data preprocessing, data from June 2020 to December 2023 (n = 716) were randomly split into a training set (n = 497) and a test set (n = 219) at a 7:3 ratio. Data from January to June 2024 (n = 91) served as an external validation set. Recursive Feature Elimination (RFE) was used to select the optimal feature subset, and repeated five-fold cross-validation was used to reduce the risk of overfitting. Five models were compared: XGBoost, Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). Model performance was evaluated using the Area Under the Curve (AUC) and Decision Curve Analysis (DCA). The best-performing model was interpreted via SHapley Additive exPlanations (SHAP).
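As an illustration of the AUC metric used to compare the models, the following minimal sketch computes AUC directly from its probabilistic definition: the chance that a randomly chosen patient with a poor outcome receives a higher predicted risk than a randomly chosen patient with a good outcome. It is not the authors' code; the function name `roc_auc` and the toy labels and scores are assumptions made up for this example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 1 = poor outcome, 0 = good outcome (hypothetical coding).
    scores: predicted risk for each patient; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; library implementations typically use a rank-based computation instead.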

Results: The 6-month poor prognosis rate among the 807 HICH patients was 27.51%. The XGBoost model performed best in the training set (AUC = 0.921, 95% CI: 0.896-0.944) and remained stable in the external validation set (AUC = 0.813, 95% CI: 0.728-0.899). DCA showed that the XGBoost model provided a higher net benefit than the other models across threshold probabilities of 0%-20% and 56%-100%. SHAP analysis identified hematoma volume as the most important predictor, with secondary contributions from Glasgow coma score, white blood cell count, age, serum albumin, and systolic blood pressure, among others.
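The net benefit reported by DCA can be sketched with the standard decision-curve formula, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The example below is a hedged illustration, not the authors' analysis: the function names, the convention that a patient is flagged when predicted risk is at least the threshold, and the toy data are all assumptions.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are penalized by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: flag every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, invented for illustration only (not from the study).
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
probs = [0.8, 0.3, 0.6, 0.1, 0.55, 0.2, 0.05, 0.4, 0.15, 0.25]

nb_model = net_benefit(labels, probs, 0.5)
nb_all = net_benefit_treat_all(labels, 0.5)
print(f"model: {nb_model:.2f}, treat all: {nb_all:.2f}")
```

A decision curve repeats this calculation over a grid of thresholds and plots each model against the "treat all" and "treat none" (net benefit 0) reference strategies.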

Conclusion: The XGBoost model demonstrated strong predictive performance for the long-term prognosis of HICH patients. The SHAP framework quantifies the contribution of each key pathophysiological indicator to the model's prediction for an individual patient, enabling individualized risk stratification and strategic allocation of medical resources.
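To make the idea of per-patient contributions concrete, the sketch below computes exact Shapley values by enumerating every feature coalition for a tiny toy risk model. This is a brute-force illustration of the underlying definition, not the tree-specific algorithm that SHAP tooling applies to XGBoost models, and not the authors' code. The model, its coefficients, the feature names, and the patient and baseline values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; 'absent' features take their baseline value."""
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_model(f):
    # Hypothetical linear risk score; coefficients are made up.
    return 0.02 * f["hematoma_volume_ml"] - 0.03 * f["gcs"] + 0.005 * f["age"]


patient = {"hematoma_volume_ml": 45, "gcs": 8, "age": 70}   # hypothetical
baseline = {"hematoma_volume_ml": 20, "gcs": 13, "age": 60}  # hypothetical

phi = shapley_values(toy_model, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the additivity property that force plots visualize.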

Keywords: SHAP; XGBoost; hypertensive cerebral hemorrhage; machine learning; predictive model.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. ROC curve analysis of five machine learning algorithms in the training dataset for predicting the long-term prognosis of HICH patients.
Figure 2. ROC curve analysis of five machine learning algorithms in the external validation set for predicting the long-term prognosis of HICH patients.
Figure 3. Decision curve analysis of five models, plotting net benefit against threshold probability.
Figure 4. Variable importance weights.
Figure 5. SHapley Additive exPlanation (SHAP) values.
Figure 6. SHapley Additive exPlanation (SHAP) force plot for two selected patients. (A) Patient with a poor prognosis. (B) Patient with a good prognosis.
