Prediction of Long-Term Stroke Recurrence Using Machine Learning Models
- PMID: 33804724
- PMCID: PMC8003970
- DOI: 10.3390/jcm10061286
Abstract
Background: The long-term risk of recurrent ischemic stroke, estimated to be between 17% and 30%, cannot be reliably assessed at an individual level. Our goal was to study whether machine learning models can be trained to predict stroke recurrence, to identify key clinical variables, and to assess whether performance metrics can be optimized.
Methods: We used patient-level data from electronic health records, six interpretable algorithms (Logistic Regression, Extreme Gradient Boosting, Gradient Boosting Machine, Random Forest, Support Vector Machine, Decision Tree), four feature selection strategies, five prediction windows, and two sampling strategies to develop 288 models for up to 5-year stroke recurrence prediction. We further identified important clinical features and different optimization strategies.
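The abstract does not name the two sampling strategies. As an illustration only, the sketch below shows random oversampling of the minority class, one common way to rebalance an outcome such as recurrence. It is an assumption rather than the authors' method; the function name `oversample_minority` and the toy rows (age, body mass index) are hypothetical.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes
    have the same number of examples (labels are 0 or 1)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("both classes must be present")
    # Draw minority indices with replacement to close the size gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: [age, body mass index]; 1 = recurrence, 0 = none.
rows = [[70, 27.1], [55, 31.4], [81, 24.0], [63, 29.9], [59, 22.5], [77, 26.3]]
labels = [1, 0, 0, 0, 0, 1]
bal_rows, bal_labels = oversample_minority(rows, labels)
```

Resampling of this kind would normally be applied to the training split only, so that the evaluation data keep the original class ratio.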
Results: We included 2091 ischemic stroke patients. Model area under the receiver operating characteristic curve (AUROC) was stable for prediction windows of 1, 2, 3, 4, and 5 years, with the highest score for the 1-year (0.79) and the lowest score for the 5-year prediction window (0.69). A total of 21 (7%) models reached an AUROC above 0.73, while 110 (38%) models reached an AUROC greater than 0.7. Among the 53 features analyzed, age, body mass index, and laboratory-based features (such as high-density lipoprotein, hemoglobin A1c, and creatinine) had the highest overall importance scores. The balance between specificity and sensitivity improved through sampling strategies.
Conclusion: All six selected algorithms could be trained to predict long-term stroke recurrence, and laboratory-based variables were highly associated with stroke recurrence. The latter could be targeted for personalized interventions. Model performance metrics could be optimized, and models can be implemented in the same healthcare system as intelligent decision support for targeted intervention.
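To make the reported metrics concrete, the following minimal sketch computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case (ties count one half), and sensitivity and specificity at a chosen threshold. The function names and the toy scores are hypothetical and not taken from the study.

```python
def auroc(y_true, scores):
    """AUROC by pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold):
    """Predict positive when score >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy predictions: 1 = recurrence, 0 = no recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

area = auroc(y_true, scores)                               # 12 of 15 pairs -> 0.8
sens, spec = sensitivity_specificity(y_true, scores, 0.5)  # 2/3 and 4/5
```

Lowering the threshold to 0.3 in this toy example raises sensitivity to 1.0 and lowers specificity to 0.6, which is the kind of trade-off that threshold choice or resampling shifts.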
Keywords: artificial intelligence; clinical decision support system; electronic health record; explainable machine learning; healthcare; interpretable machine learning; ischemic stroke; machine learning; outcome prediction; recurrent stroke.
Conflict of interest statement
The authors declare no conflict of interest.
Similar articles
- Early Detection of Septic Shock Onset Using Interpretable Machine Learners. J Clin Med. 2021 Jan 15;10(2):301. doi: 10.3390/jcm10020301. PMID: 33467539. Free PMC article.
- Machine Learning-Enabled 30-Day Readmission Model for Stroke Patients. Front Neurol. 2021 Mar 31;12:638267. doi: 10.3389/fneur.2021.638267. eCollection 2021. PMID: 33868147. Free PMC article.
- Predicting post-stroke pneumonia using deep neural network approaches. Int J Med Inform. 2019 Dec;132:103986. doi: 10.1016/j.ijmedinf.2019.103986. Epub 2019 Oct 1. PMID: 31629312.
- Data-driven modeling and prediction of blood glucose dynamics: Machine learning applications in type 1 diabetes. Artif Intell Med. 2019 Jul;98:109-134. doi: 10.1016/j.artmed.2019.07.007. Epub 2019 Jul 26. PMID: 31383477. Review.
- State of the Art of Machine Learning-Enabled Clinical Decision Support in Intensive Care Units: Literature Review. JMIR Med Inform. 2022 Mar 3;10(3):e28781. doi: 10.2196/28781. PMID: 35238790. Free PMC article. Review.
Cited by
- OptiSelect and EnShap: Integrating machine learning and game theory for ischemic stroke prediction. PLoS One. 2025 Aug 13;20(8):e0328967. doi: 10.1371/journal.pone.0328967. eCollection 2025. PMID: 40802707. Free PMC article.
- Longitudinal Data to Enhance Dynamic Stroke Risk Prediction. Healthcare (Basel). 2022 Oct 27;10(11):2134. doi: 10.3390/healthcare10112134. PMID: 36360476. Free PMC article.
- Imputation of missing values for electronic health record laboratory data. NPJ Digit Med. 2021 Oct 11;4(1):147. doi: 10.1038/s41746-021-00518-0. PMID: 34635760. Free PMC article.
- A machine learning model for visualization and dynamic clinical prediction of stroke recurrence in acute ischemic stroke patients: A real-world retrospective study. Front Neurosci. 2023 Mar 27;17:1130831. doi: 10.3389/fnins.2023.1130831. eCollection 2023. PMID: 37051146. Free PMC article.
- An integrated pipeline for prediction of Clostridioides difficile infection. Sci Rep. 2023 Oct 2;13(1):16532. doi: 10.1038/s41598-023-41753-7. PMID: 37783691. Free PMC article.