Int J Med Inform. 2025 Apr:196:105811.
doi: 10.1016/j.ijmedinf.2025.105811. Epub 2025 Jan 28.

Machine learning to predict stroke risk from routine hospital data: A systematic review

Free article

William Heseltine-Carp et al. Int J Med Inform. 2025 Apr.

Abstract

Purpose: Stroke remains a leading cause of morbidity and mortality. Despite this, current risk stratification tools such as CHA2DS2-VASc and QRISK3 have limited accuracy, particularly in patients without a diagnosis of atrial fibrillation (AF). Hence, more accurate stroke risk prediction models are needed. Machine learning (ML) may provide a solution by leveraging existing routine hospital databases to build accurate stroke risk prediction models and to identify novel risk factors for stroke.

Aims: In this systematic review we appraise current research using ML to predict stroke risk from routine hospital data. Based on these findings we then highlight common methodological limitations and recommendations for future research.

Methods: In this review we identify 49 original research articles (38 in the general population and 11 in AF-specific populations) from the PubMed database, published between January 2013 and December 2024, that use ML and routine hospital data to predict the risk of stroke.

Results: ML models were able to accurately predict stroke risk in both AF-specific and general populations, with AUCs ranging from 0.64 to 0.99. Where tested, ML also consistently outperformed traditional risk stratification tools, such as CHA2DS2-VASc. ML also appeared useful in identifying several novel risk factors from electrocardiogram, laboratory test and echocardiography data. However, the quality of datasets was often limited, there was a high suspicion of overfitting, and models often lacked calibration, external validation and explainability analysis.
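
The Results distinguish discrimination (AUC) from calibration, and a model can score well on one while failing the other. The following is a minimal illustrative sketch of that distinction, not a method taken from the reviewed studies. The function names and the toy data are assumptions made for this example, and the Brier score and calibration-in-the-large are used here only as two common, simple calibration summaries.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_in_the_large(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean predicted risk minus observed event rate (0 means no average bias)."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


# Hypothetical toy data: one stroke among four patients.
outcomes = [0, 0, 0, 1]
predicted_risk = [0.6, 0.7, 0.8, 0.9]

toy_auc = auc(outcomes, predicted_risk)
toy_brier = brier_score(outcomes, predicted_risk)
toy_bias = calibration_in_the_large(outcomes, predicted_risk)

print(f"AUC: {toy_auc:.2f}")
print(f"Brier score: {toy_brier:.3f}")
print(f"Mean predicted risk minus event rate: {toy_bias:.2f}")
```

In this toy case the ranking is perfect, so the AUC is at its maximum, yet the predicted risks are far higher on average than the observed event rate. A high AUC reported without a calibration assessment therefore leaves open whether the predicted risks themselves can be trusted.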

Conclusion: Whilst ML has shown great potential in stroke prediction and in identifying novel risk factors for stroke, improvements in study methodology are required prior to integration of ML into routine healthcare. Future research should adhere to the EQUATOR guidance on prediction models and encourage interdisciplinary collaboration between computer scientists and clinicians. Further prospective RCTs are also required to validate models in the clinical setting and to identify barriers to integrating ML into routine healthcare.

Keywords: Artificial intelligence; Ischaemic stroke; Machine learning; Risk evaluation; Routine hospital data; Stroke.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.