Explainable prediction of daily hospitalizations for cerebrovascular disease using stacked ensemble learning
- PMID: 37024922
- PMCID: PMC10080841
- DOI: 10.1186/s12911-023-02159-7
Abstract
Background: With the prevalence of cerebrovascular disease (CD) and the increasing strain on healthcare resources, forecasting the healthcare demands of cerebrovascular patients has significant implications for optimizing medical resources.
Methods: We propose a stacking ensemble model comprising four base learners (ridge regression, random forest, gradient boosting decision tree, and artificial neural network) and a meta learner (elastic net) to predict the daily number of hospital admissions (HAs) for CD from historical HA data, air quality data, and meteorological data in Chengdu, China, from 2015 to 2018. To address the label imbalance problem, a re-weighting method based on label distribution smoothing was integrated into the meta learner. We trained the model on the data from 2015 to 2017 and evaluated its predictive ability on the 2018 data using four metrics: mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination (R²). In addition, the SHapley Additive exPlanations (SHAP) framework was applied to explain the predictions of the stacking model.
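As a rough illustration of the re-weighting idea, the sketch below computes per-sample weights by smoothing the empirical label histogram with a Gaussian kernel and taking the inverse of the smoothed density, which is the general form of label distribution smoothing. It is a minimal sketch, not the authors' implementation: the function names, the kernel width (`half_width`, `sigma`), the one-bin-per-integer histogram, and the normalization to a mean weight of 1 are all assumptions made for this example.

```python
import math
from collections import Counter


def gaussian_kernel(half_width, sigma):
    """Symmetric Gaussian kernel of length 2*half_width + 1, normalized to sum to 1."""
    raw = [math.exp(-(k * k) / (2.0 * sigma * sigma))
           for k in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [v / total for v in raw]


def lds_weights(labels, half_width=2, sigma=1.0):
    """Per-sample weights inversely proportional to the smoothed label density.

    labels: integer daily admission counts (hypothetical input).
    half_width, sigma: illustrative defaults, not values from the paper.
    """
    lo, hi = min(labels), max(labels)
    counts = Counter(labels)
    # Empirical histogram with one bin per integer label value.
    hist = [counts.get(v, 0) for v in range(lo, hi + 1)]
    kernel = gaussian_kernel(half_width, sigma)

    # Convolve the histogram with the kernel; bins outside the range count as zero.
    smoothed = []
    for i in range(len(hist)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half_width
            if 0 <= j < len(hist):
                acc += w * hist[j]
        smoothed.append(acc)

    # Rare label values have low smoothed density and so receive larger weights.
    raw = [1.0 / smoothed[y - lo] for y in labels]
    # Rescale so the weights average to 1 (an assumption of this sketch).
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]
```

Weights of this kind could then be supplied as per-sample weights when fitting the meta learner, so that days with unusually high or low admission counts contribute more to the loss than their raw frequency would allow.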
Results: Our proposed model outperformed all the base learners and long short-term memory (LSTM) on two datasets. In particular, compared with the best results obtained by the individual models, the MAE, RMSE, and MAPE of the stacking model decreased by 13.9%, 12.7%, and 5.8%, respectively, and the R² improved by 6.8% on the CD dataset. The model explanation showed that environmental features further improved model performance and indicated that high temperature and high concentrations of gaseous air pollutants might be strongly associated with an increased risk of CD.
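For reference, the four evaluation metrics named above follow their standard definitions and can be computed as in the sketch below. The function name and the toy numbers in the usage note are hypothetical and are not taken from the study; MAPE is expressed in percent and assumes no true value is zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent), and R2 for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}
```

For example, `regression_metrics([10, 20, 30, 40], [12, 18, 33, 39])` gives an MAE of 2.0, a MAPE of 10.625%, and an R² of 0.964 on these made-up values.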
Conclusions: Our stacking model considering environmental exposure is efficient in predicting daily HAs for CD and has practical value in early warning and healthcare resource allocation.
Keywords: Cerebrovascular disease; Environmental exposure; Hospital admissions; SHAP value; Stacking ensemble model.
© 2023. The Author(s).
Conflict of interest statement
The authors declare that they have no competing interests.