COVID-19 risk stratification among older adults: a machine learning approach to identify personal and health-related risk factors
- PMID: 40730988
- PMCID: PMC12306135
- DOI: 10.1186/s12889-025-23862-2
Abstract
Background: The COVID-19 pandemic highlighted the need to understand factors influencing individuals' risk perceptions and health behaviors. This study aimed to explore the roles of individuals' knowledge, perception, and health-related issues in determining COVID-19 risk by developing a predictive model for classifying individuals into risk categories, incorporating both clustering and model interpretation techniques.
Methods: To identify distinct COVID-19 risk groups, clustering analysis was applied to demographic, health, and behavioral data. Subsequently, several machine learning models, including CatBoost, XGBoost, Random Forest, Generalized Linear Model (GLM), Decision Tree, H2O Deep Neural Network (DNN), and L2 SVM, were used to predict risk classifications. SHAP (SHapley Additive exPlanations) analysis was applied to interpret the contribution of individual features to model predictions.
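The three-stage workflow described in the Methods (cluster, classify, interpret) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic, the feature names are hypothetical, k-means stands in for the unspecified clustering method, a nearest-centroid rule stands in for the classifiers named in the study, and exact Shapley values against a single mean baseline stand in for the SHAP library, which estimates the same kind of attribution.

```python
import itertools
import math
import random

FEATURES = ["knowledge", "hypertension", "household_dx"]  # hypothetical names

def make_data(n_per_group=60, seed=1):
    """Synthetic respondents drawn around three made-up group centres."""
    rng = random.Random(seed)
    centres = [(0.9, 0.1, 0.0), (0.2, 0.8, 0.1), (0.9, 0.8, 0.9)]
    return [tuple(rng.gauss(m, 0.05) for m in c)
            for c in centres for _ in range(n_per_group)]

def mean(points):
    """Column-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return [nearest(p, centroids) for p in points]

def shapley(f, x, baseline):
    """Exact Shapley values of f at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        return f(tuple(x[j] if j in subset else baseline[j] for j in range(n)))

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for r in range(n):  # r = size of the coalition that excludes feature i
            w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for s in itertools.combinations(others, r):
                total += w * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

# Step 1: cluster unlabeled rows into three candidate risk classes.
X = make_data()
y = kmeans(X, k=3)

# Step 2: fit a classifier on a training split and score it on held-out rows.
idx = list(range(len(X)))
random.Random(0).shuffle(idx)
cut = len(idx) * 7 // 10
train, test = idx[:cut], idx[cut:]
class_centroids = [mean([X[i] for i in train if y[i] == c]) for c in range(3)]
accuracy = sum(nearest(X[i], class_centroids) == y[i] for i in test) / len(test)

# Step 3: attribute one prediction's class score to individual features.
x0 = X[test[0]]
target = nearest(x0, class_centroids)

def score(p):
    """Class score: negative squared distance to the predicted class centroid."""
    return -math.dist(p, class_centroids[target]) ** 2

baseline = mean(X)
phi = shapley(score, x0, baseline)

print(f"held-out accuracy: {accuracy:.3f}")
for name, v in zip(FEATURES, phi):
    print(f"{name:>14}: {v:+.4f}")
```

Because the class labels here come from the clustering step itself, the held-out accuracy measures how well the classifier reproduces the cluster assignments, not how well it predicts an independently observed outcome. The attributions sum to the difference between the score at the chosen row and the score at the baseline, which is the efficiency property that Shapley-based explanations rely on.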
Results: Three distinct risk classes were identified: Class 0 (high knowledge, low-risk factors, no household COVID-19 diagnosis), Class 1 (health-related issues (e.g., hypertension), low knowledge), and Class 2 (high knowledge, higher health risks (e.g., hypertension, household COVID-19 diagnosis)). L2 SVM achieved the highest accuracy (0.9724), followed by XGBoost (0.9301) and CatBoost (0.9265). SHAP analysis revealed that household hygiene practices and health-related issues, such as hypertension and gastrointestinal symptoms, were key drivers of risk classification.
Conclusion: Integrating individuals' knowledge, perception, and health-related issues into COVID-19 risk assessments enhances predictive accuracy. Public health policies should focus on both physical and psychological factors to effectively mitigate the spread and impact of COVID-19. Data-driven models may inform future efforts to prioritize resource allocation and improve public health responses for vulnerable populations.
Keywords: COVID-19; Health behavior; Machine learning; Perception; Predictive learning models.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: All procedures were performed in accordance with the Declaration of Helsinki and have been approved by the ethics committee of Iran University of Medical Sciences (IR.IUMS.REC.1399.1310). Written informed consent was obtained from all participants before their involvement. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.