Explainable artificial intelligence driven insights into smoking prediction using machine learning and clinical parameters
- PMID: 40617930
- PMCID: PMC12228788
- DOI: 10.1038/s41598-025-09409-w
Abstract
Smoking is a leading cause of various health conditions, including cancer and respiratory diseases. Smokers often face medical restrictions such as limitations in blood and organ donation, reduced effectiveness of medications, and increased surgical complications. These impacts underscore the need for early detection of smoking status to enable timely intervention. This study explores the use of Artificial Intelligence (AI) and Machine Learning (ML) techniques to predict smoking status based on health parameters, including biosignals and clinical biomarkers. A balanced subset of 2,000 instances was sampled from a publicly available Kaggle dataset comprising clinical and biometric features. Multiple ML models were implemented, including Random Forest Classifier, Logistic Regression, Decision Tree Classifier, K-Nearest Neighbors, CatBoost Classifier, and an Artificial Neural Network. The Random Forest Classifier achieved the best performance, with an accuracy of 0.80, precision of 0.80, recall of 0.80, and F1-score of 0.79. To enhance model interpretability, four Explainable Artificial Intelligence (XAI) techniques were applied: Shapley Additive Explanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), QLattice, and Anchor. SHAP identified hemoglobin as the most influential predictor, while LIME, QLattice, and Anchor highlighted the role of gamma-glutamyl transferase (GTP). Interactions between hemoglobin, GTP, and height were associated with more accurate predictions. The integration of ensemble modeling and multiple XAI approaches offers deeper interpretability than prior studies, providing healthcare providers and policymakers with a robust, transparent decision-support tool for targeted intervention strategies.
Keywords: Artificial intelligence; Health parameters; Machine learning; Smokers detection; XAI.
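The balanced sampling and the four reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names `balanced_sample` and `macro_metrics`, the `smoking` label key, the synthetic rows, and the choice of macro averaging are all assumptions made for illustration, since the abstract does not state how the subset was drawn or which averaging was used.

```python
import random
from collections import defaultdict


def balanced_sample(rows, label_key, n_per_class, seed=0):
    """Draw n_per_class rows from each class, without replacement (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    sample = []
    for label in sorted(by_class):
        sample.extend(rng.sample(by_class[label], n_per_class))
    rng.shuffle(sample)
    return sample


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Synthetic, imbalanced stand-in for the dataset (not real clinical data).
rows = [{"id": i, "smoking": 0} for i in range(3000)]
rows += [{"id": 3000 + i, "smoking": 1} for i in range(1500)]

# 1,000 rows per class gives a balanced subset of 2,000 instances.
subset = balanced_sample(rows, "smoking", n_per_class=1000)
counts = {c: sum(1 for r in subset if r["smoking"] == c) for c in (0, 1)}

# Toy predictions to exercise the metric definitions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = macro_metrics(y_true, y_pred)
```

Under macro averaging, F1 is the mean of per-class F1 scores, so it can fall slightly below both the averaged precision and the averaged recall when the classes are predicted unevenly.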
Conflict of interest statement
Declarations. Competing interests: The authors declare no competing interests.
Similar articles
- Advancing personalized healthcare: leveraging explainable AI for BPPV risk assessment. Health Inf Sci Syst. 2024 Nov 24;13(1):1. doi: 10.1007/s13755-024-00317-3. eCollection 2025 Dec. PMID: 39606094
- Understanding machine learning predictions of wastewater treatment plant sludge with explainable artificial intelligence. Water Environ Res. 2024 Oct;96(10):e11136. doi: 10.1002/wer.11136. PMID: 39322560
- Beyond black-box models: explainable AI for embryo ploidy prediction and patient-centric consultation. J Assist Reprod Genet. 2024 Sep;41(9):2349-2358. doi: 10.1007/s10815-024-03178-7. Epub 2024 Jul 4. PMID: 38963605
- Prediction of disease comorbidity using explainable artificial intelligence and machine learning techniques: A systematic review. Int J Med Inform. 2023 Jul;175:105088. doi: 10.1016/j.ijmedinf.2023.105088. Epub 2023 May 4. PMID: 37156169
- The enlightening role of explainable artificial intelligence in medical & healthcare domains: A systematic literature review. Comput Biol Med. 2023 Nov;166:107555. doi: 10.1016/j.compbiomed.2023.107555. Epub 2023 Oct 4. PMID: 37806061