Front Neurol. 2025 Mar 4:16:1522868.
doi: 10.3389/fneur.2025.1522868. eCollection 2025.

Interpretable prediction of stroke prognosis: SHAP for SVM and nomogram for logistic regression

Kun Guo et al. Front Neurol. 2025.

Abstract

Background: Ischemic Stroke (IS) stands as a leading cause of mortality and disability globally, with an anticipated increase in IS-related fatalities by 2030. Despite therapeutic advancements, many patients still lack effective interventions, underscoring the need for improved prognostic assessment tools. Machine Learning (ML) models have emerged as promising tools for predicting stroke prognosis, surpassing traditional methods in accuracy and speed.

Objective: The aim of this study was to develop and validate ML algorithms for predicting the 6-month prognosis of patients with Acute Cerebral Infarction, using clinical data from two medical centers in China, and to assess the feasibility of implementing Explainable ML in clinical settings.

Methods: A retrospective observational cohort study was conducted involving 398 patients diagnosed with Acute Cerebral Infarction from January 2023 to February 2024. The dataset included demographic information, medical histories, clinical evaluations, and laboratory results. Six ML models were constructed: Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Random Forest, XGBoost, and AdaBoost. Model performance was evaluated using the Area Under the Receiver Operating Characteristic curve (AUC), sensitivity, specificity, predictive values, and F1 score, with five-fold cross-validation to ensure robustness.
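The evaluation metrics named in the Methods can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names (`auc_score`, `threshold_metrics`, `k_fold_indices`), the 0.5 decision threshold, and the interleaved fold assignment are assumptions made for illustration, and the abstract does not state how the folds were drawn.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, predictive values, and F1 at one threshold.
    The 0.5 default is an assumption, not a value taken from the paper."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def k_fold_indices(n: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k folds.
    Indices are dealt out round-robin; shuffle the data first if order matters."""
    folds = [list(range(start, n, k)) for start in range(k)]
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]
```

In a five-fold setup, each of the five validation folds would be scored with `auc_score` and `threshold_metrics`, and the per-fold values averaged to judge stability.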

Results: Analysis of the training set identified key variables associated with stroke prognosis, including hypertension, diabetes, and smoking history. The SVM model demonstrated exceptional performance, with an AUC of 0.9453 on the training set and 0.9213 on the validation set. A nomogram based on Logistic Regression was developed to visualize prognostic risk, incorporating factors such as the National Institutes of Health Stroke Scale (NIHSS) score, Barthel Index (BI), Watanabe Drinking Test (KWST) score, Platelet Distribution Width (PDW), and others. Our models showed high predictive accuracy and stability across both datasets.
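A nomogram is a graphical rescaling of a fitted logistic regression: each predictor's contribution to the log-odds is converted to points, and the total points map back to a predicted risk. The sketch below shows that conversion under stated assumptions. The coefficients, the intercept, and the PDW range are hypothetical placeholders and are not values reported in the paper; only the predictor names come from the abstract, and the NIHSS (0 to 42) and Barthel Index (0 to 100) ranges are the standard ranges of those scales.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Predictor:
    name: str
    beta: float  # hypothetical log-odds coefficient, NOT from the paper
    low: float   # lower end of the axis drawn on the nomogram
    high: float  # upper end of the axis drawn on the nomogram

    @property
    def min_contribution(self) -> float:
        """Smallest log-odds contribution over the axis range (the 0-point end)."""
        return min(self.beta * self.low, self.beta * self.high)

    @property
    def span(self) -> float:
        """Width of the log-odds contribution over the axis range."""
        return abs(self.beta) * (self.high - self.low)


# Hypothetical model: all numbers below are illustrative placeholders.
INTERCEPT = -2.0
PREDICTORS: List[Predictor] = [
    Predictor("NIHSS", beta=0.15, low=0, high=42),
    Predictor("BI", beta=-0.04, low=0, high=100),
    Predictor("PDW", beta=0.10, low=9, high=20),
]


def nomogram_points(predictors: List[Predictor], values: Dict[str, float]) -> Dict[str, float]:
    """Points per predictor; the predictor with the widest span tops out at 100."""
    max_span = max(p.span for p in predictors)
    return {
        p.name: 100.0 * (p.beta * values[p.name] - p.min_contribution) / max_span
        for p in predictors
    }


def predicted_risk(intercept: float, predictors: List[Predictor], values: Dict[str, float]) -> float:
    """Map total points back to the linear predictor, then apply the logistic function."""
    max_span = max(p.span for p in predictors)
    total_points = sum(nomogram_points(predictors, values).values())
    linear_predictor = (
        intercept
        + sum(p.min_contribution for p in predictors)
        + total_points * max_span / 100.0
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

Because the point scale is only a linear rescaling, `predicted_risk` returns the same probability as evaluating the logistic model directly, which is what makes the nomogram a faithful paper-and-pencil version of the model.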

Conclusion: This study presents a robust ML approach for predicting stroke prognosis, with the SVM model and Nomogram providing valuable tools for clinical decision-making. By incorporating advanced ML techniques, we enhance the precision of prognostic assessments and offer a theoretical and practical framework for clinical application.

Keywords: clinical decision support; ischemic stroke; machine learning; predictive modeling; prognosis.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Workflow of the patient selection.
Figure 2. The LASSO model, employing a tuning parameter (λ) and utilizing fivefold cross-validation with both minimum and 1se criteria (B), was used to select radiomics features during the feature selection process (A).
Figure 3. A nomogram based on logistic regression for clinical decision-making.
Figure 4. Receiver Operating Characteristic (ROC) curves for logistic regression in the training set (A) and validation set (B).
Figure 5. Calibration curves for logistic regression: training set (A) and validation set (B).
Figure 6. The clinical decision curve (DCA) of the logistic regression model for the training set and the validation set.
Figure 7. SVM model SHAP value summary plot (A), and SVM model SHAP value scatter plot (B).
Figure 8. SHAP force plot showing feature contributions.
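
The SHAP plots in Figures 7 and 8 attribute a model's output to individual features using Shapley values. As a rough illustration of the underlying idea, the sketch below computes exact Shapley values by enumerating every feature coalition, replacing absent features with a fixed baseline. This is a simplification chosen for clarity: it is not the `shap` library and not the authors' pipeline, and `toy_model` is a hypothetical stand-in for an SVM decision function.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values for one instance x.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition: frozenset) -> float:
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z: Sequence[float]) -> float:
    """Hypothetical scoring function with one interaction term."""
    return 0.5 * z[0] - 0.2 * z[1] + 0.1 * z[0] * z[2]
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the property a force plot displays: features push the prediction up or down from a reference value.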
