Cureus. 2023 Dec 13;15(12):e50426.
doi: 10.7759/cureus.50426. eCollection 2023 Dec.

Machine Learning Models for Predicting Stroke Mortality in Malaysia: An Application and Comparative Analysis

Che Muhammad Nur Hidayat Che Nawi et al. Cureus.

Abstract

Background: Stroke is a significant public health concern characterized by increasing mortality and morbidity. Accurate long-term outcome prediction for acute stroke patients, particularly stroke mortality, is vital for clinical decision-making and prognostic management. This study aimed to develop and compare various prognostic models for stroke mortality prediction.

Methods: In a retrospective cohort study from January 2016 to December 2021, we collected data from patients diagnosed with acute stroke from five selected hospitals. Data contained variables on demographics, comorbidities, and interventions retrieved from medical records. The cohort comprised 950 patients with 20 features. Outcomes (censored vs. death) were determined by linking data with the Malaysian National Mortality Registry. We employed three common survival modeling approaches, the Cox proportional hazard regression (Cox), support vector machine (SVM), and random survival forest (RSF), while enhancing the Cox model with Elastic Net (Cox-EN) for feature selection. Models were compared using the concordance index (C-index), time-dependent area under the curve (AUC), and discrimination index (D-index), with calibration assessed by the Brier score.

Results: The support vector machine (SVM) model excelled among the four, with three-month, one-year, and three-year time-dependent AUC values of 0.842, 0.846, and 0.791; a D-index of 5.31 (95% CI: 3.86, 7.30); and a C-index of 0.803 (95% CI: 0.758, 0.847). All models exhibited robust calibration, with three-month, one-year, and three-year Brier scores ranging from 0.103 to 0.220, all below 0.25.

Conclusion: The support vector machine (SVM) model demonstrated superior discriminative performance, suggesting its efficacy in developing prognostic models for stroke mortality. This study enhances stroke mortality prediction and supports clinical decision-making, emphasizing the utility of the support vector machine method.
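To make two of the evaluation metrics named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' code: the function names and the toy data are hypothetical, the C-index version simply skips pairs with tied observed times, and the Brier score version ignores censoring (published time-dependent Brier scores typically apply censoring weights). It also shows why 0.25 is a natural reference value for the Brier score: a non-informative prediction of 0.5 for every patient scores exactly 0.25.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair is comparable when the patient with the shorter observed time
    had the event (death). Pairs with tied times are skipped (assumption).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier patient was censored
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher predicted risk died earlier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score(died_by_horizon, predicted_death_prob):
    """Mean squared error between outcome (0/1) and predicted probability.

    Simplification: censoring before the horizon is ignored.
    """
    pairs = list(zip(died_by_horizon, predicted_death_prob))
    return sum((p - y) ** 2 for y, p in pairs) / len(pairs)


# Hypothetical toy cohort: follow-up in months, 1 = death, 0 = censored.
times = [2, 5, 7, 10, 12]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = concordance_index(times, events, risk)

# A non-informative model that predicts 0.5 for everyone.
uninformative = brier_score([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])

print(c_index, uninformative)
```

In the toy cohort, eight pairs are comparable and seven are ranked correctly, so the C-index is 0.875. A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.803 should be read.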

Keywords: comparative analysis; machine learning; malaysia; prediction models; stroke mortality.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Graphical illustration of the study workflow
HCTM UKM: Hospital Canselor Tuanku Muhriz Universiti Kebangsaan Malaysia; HSJ: Hospital Seberang Jaya; HSNZ: Hospital Sultanah Nur Zahirah; HRPZ II: Hospital Raja Perempuan Zainab II; HUSM: Hospital Universiti Sains Malaysia; Cox: Cox proportional hazard regression; Cox-EN: Cox model with elastic net; C-index: Concordance index; D-index: Discrimination index.

Figure 2. Overall survival of acute stroke patients

Figure 3. The coefficient of features change for varying alphas for the Cox-EN model
nihss: National Institutes of Health Stroke Scale; dm: Diabetic status; age: Age in years; gcs_reduct: Glasgow Coma Scale reduction; ethnicity: Ethnicity; ss: Study site; sex: Gender of the patients.

Figure 4. The important coefficient of each feature corresponding to the optimal α by an elastic net
age: Age in years; nihss: National Institutes of Health Stroke Scale; dm: Diabetic status; sex: Gender of the patients; ethnicity: Ethnicity; gcs_reduct: Glasgow Coma Scale reduction; lipid: Hyperlipidemia; hf_ihd: Heart disease; married: Marital status; hpt: Hypertension status; ss: Study site; af: Atrial fibrillation.

Figure 5. The important coefficient of each feature by random survival forest
af: Atrial fibrillation; hpt: Hypertension status; ethnicity: Ethnicity; hf_ihd: Heart disease; married: Marital status; lipid: Hyperlipidemia; sex: Gender of the patients; ss: Study site; dm: Diabetic status; gcs_reduct: Glasgow Coma Scale reduction; nihss: National Institutes of Health Stroke Scale; age: Age in years.

Figure 6. Time-dependent receiver operating characteristic curves of models at three months, one year, and three years
Cox: Cox proportional hazard regression; Cox-EN: Cox model with elastic net; RSF: Random survival forest; SVM: Support vector machine.

Figure 7. Time-dependent AUC of models over time
AUC: Area under the curve; Cox: Cox proportional hazard regression; Cox-EN: Cox model with elastic net; RSF: Random survival forest; SVM: Support vector machine. The blue dotted line represents the mean area under the curve.

Figure 8. Survival curves of high-risk and low-risk groups divided according to the risk score from (A) Cox: Cox proportional hazard regression, (B) Cox-EN: Cox model with elastic net, (C) SVM: support vector machine, and (D) RSF: random survival forest

Figure 9. Performance of the SVM model across different alpha values during hyperparameter tuning

