Cross-sectional study on smoking types and stroke risk: development of a predictive model for identifying stroke risk
- PMID: 40196720
- PMCID: PMC11973365
- DOI: 10.3389/fphys.2025.1528910
Abstract
Background: Stroke, a major global health concern, is responsible for high mortality and long-term disabilities. With the aging population and increasing prevalence of risk factors, its incidence is on the rise. Existing risk assessment tools have limitations, and there is a pressing need for more accurate and personalized stroke risk prediction models. Smoking, a significant modifiable risk factor, has not been comprehensively examined in current models regarding different smoking types.
Methods: Data were sourced from the 2015-2018 National Health and Nutrition Examination Survey (NHANES) and the 2020-2021 Behavioral Risk Factor Surveillance System (BRFSS). Tobacco use (including combustible cigarettes and e-cigarettes) and stroke history were obtained through questionnaires. Participants were divided into four subgroups: non-smokers, exclusive combustible cigarette users, exclusive e-cigarette users, and dual users. Covariates such as age, sex, race, education, and health conditions were also collected. Multivariate logistic regression was used to analyze the relationship between smoking and stroke. Four machine-learning models (XGBoost, logistic regression, random forest, and Gaussian naive Bayes) were evaluated using the area under the receiver-operating characteristic curve (AUC), and the SHapley Additive exPlanations (SHAP) method was applied for feature importance ranking and model interpretation.
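The four-way grouping described in the Methods can be illustrated with a minimal sketch. It assumes each participant has already been reduced to two yes/no flags (current combustible cigarette use, current e-cigarette use). The function name, field names, and toy records are hypothetical; the abstract does not give the actual NHANES or BRFSS variable codes or the exact definition of current use.

```python
from collections import Counter


def smoking_subgroup(uses_cigarettes: bool, uses_ecigarettes: bool) -> str:
    """Map two questionnaire-derived flags to one of the four subgroups."""
    if uses_cigarettes and uses_ecigarettes:
        return "dual user"
    if uses_cigarettes:
        return "exclusive combustible cigarette user"
    if uses_ecigarettes:
        return "exclusive e-cigarette user"
    return "non-smoker"


# Hypothetical toy records, not survey data.
participants = [
    {"id": 1, "cig": False, "ecig": False},
    {"id": 2, "cig": True, "ecig": False},
    {"id": 3, "cig": False, "ecig": True},
    {"id": 4, "cig": True, "ecig": True},
    {"id": 5, "cig": False, "ecig": False},
]

groups = [smoking_subgroup(p["cig"], p["ecig"]) for p in participants]
counts = Counter(groups)

for name, n in counts.items():
    print(f"{name}: {n}")
```

Because the two flags are checked jointly first, the four categories are mutually exclusive and every participant falls into exactly one of them.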
Results: A total of 273,028 individuals were included in the study. Exclusive combustible cigarette users had an elevated stroke risk (β: 1.36, 95% CI: 1.26-1.47, P < 0.0001). Among the four machine-learning models, the XGBoost model showed the best discriminative ability with an AUC of 0.794 (95% CI = 0.787-0.802).
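To make the reported metric concrete, the sketch below computes an AUC and a 95% confidence interval on toy data. The abstract does not state how the interval was obtained, so the percentile bootstrap used here is an assumption, and the labels, scores, and function names are hypothetical. The pairwise AUC loop is meant for small examples only and would be too slow for a cohort of this size.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(round((alpha / 2) * n_boot))]
    upper = stats[int(round((1 - alpha / 2) * n_boot)) - 1]
    return lower, upper


# Hypothetical toy data: 1 = stroke history, 0 = none.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.90]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI = {lower:.3f}-{upper:.3f})")
```

With only eight observations the interval is very wide; the narrow interval reported above reflects the large sample.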
Conclusion: This study reveals a significant association between smoking types and stroke risk. An XGBoost-based stroke prediction model was established, which has the potential to improve the accuracy of stroke risk assessment and contribute to personalized interventions for stroke prevention, thus alleviating the healthcare burden related to stroke.
Keywords: SHAP; XGBoost; machine learning; prediction model; stroke.
Copyright © 2025 Ding, Yuan, Cheng and Wen.
Conflict of interest statement
Author MY was employed by Spring Airlines Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.