Predicting Self-Rated Health Across the Life Course: Health Equity Insights from Machine Learning Models
- PMID: 33620624
- PMCID: PMC8131482
- DOI: 10.1007/s11606-020-06438-1
Abstract
Background: Self-rated health is a strong predictor of mortality and morbidity. Machine learning techniques may provide insights into which of the multifaceted contributors to self-rated health are key drivers in diverse groups.
Objective: We used machine learning algorithms to predict self-rated health in diverse groups in the Behavioral Risk Factor Surveillance System (BRFSS), to understand how machine learning algorithms might be used explicitly to examine drivers of self-rated health in diverse populations.
Design: We applied three common machine learning algorithms to predict self-rated health in the 2017 BRFSS survey, stratified by age, race/ethnicity, and sex. We replicated our process in the 2016 BRFSS survey.
Participants: We analyzed data from 449,492 adult participants of the 2017 BRFSS survey.
Main measures: We used area under the curve (AUC) statistics to assess model fit within each group. We then used traditional logistic regression to estimate the association between self-rated health and the features identified by the machine learning models.
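As an illustration of the within-group AUC measure described above, the following is a minimal sketch, not the authors' code. It computes AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case (ties count one half), separately for each stratum. The function names, the record layout, and the toy strata labels are assumptions made for this example.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 0/1 outcomes (1 = excellent or very good self-rated health, by assumption)
    scores: predicted scores from any model; higher means more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count pairs where the positive outscores the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(records):
    """Compute AUC per stratum from (group, label, score) tuples."""
    grouped = {}
    for group, label, score in records:
        ys, ss = grouped.setdefault(group, ([], []))
        ys.append(label)
        ss.append(score)
    return {g: auc(ys, ss) for g, (ys, ss) in grouped.items()}


# Toy data only; these are not BRFSS values.
toy = [
    ("18-34", 1, 0.9), ("18-34", 0, 0.2),
    ("35-64", 1, 0.3), ("35-64", 0, 0.6),
]
per_group = auc_by_group(toy)
```

The pairwise loop is quadratic in group size, so a rank-based implementation would be the practical choice for a sample as large as the one analyzed here.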
Key results: Each algorithm, regularized logistic regression (AUC: 0.81), random forest (AUC: 0.80), and support vector machine (AUC: 0.81), provided good model fit in the BRFSS. Predictors of self-rated health were similar by sex and race/ethnicity but differed by age. Socioeconomic features were prominent predictors of self-rated health in mid-life age groups. Income [OR: 1.70 (95% CI: 1.62-1.80)], education [OR: 2.02 (95% CI: 1.89-2.16)], physical activity [OR: 1.52 (95% CI: 1.46-1.58)], depression [OR: 0.66 (95% CI: 0.63-0.68)], difficulty concentrating [OR: 0.62 (95% CI: 0.58-0.66)], and hypertension [OR: 0.59 (95% CI: 0.57-0.61)] all predicted the odds of excellent or very good self-rated health.
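For readers less familiar with how odds ratios and their intervals relate to logistic regression output, the sketch below shows the standard Wald relationship: the odds ratio is the exponentiated coefficient, and the 95% CI is the exponentiated coefficient plus or minus 1.96 standard errors. This is a generic illustration, not the authors' analysis code. The function names are hypothetical, and the standard error is back-calculated from the rounded interval reported for income, so it is only an approximation.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Approximate the standard error of the log-odds from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Income, as reported in the abstract: OR 1.70 (95% CI: 1.62-1.80).
beta_income = math.log(1.70)          # coefficient implied by the reported OR
se_income = se_from_ci(1.62, 1.80)    # assumption: derived from rounded bounds
or_income, lo_income, hi_income = odds_ratio_ci(beta_income, se_income)
```

Because the published bounds are rounded, the reconstructed interval will match the reported one only approximately. An odds ratio above 1 (income, education, physical activity) indicates higher odds of excellent or very good self-rated health; one below 1 (depression, difficulty concentrating, hypertension) indicates lower odds.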
Conclusions: Our analysis of BRFSS data shows that social determinants of health are prominent predictors of self-rated health in mid-life. Our work may demonstrate promising practices for using machine learning to advance health equity.
Keywords: healthcare disparities; machine learning; self-rated health; social determinants of health; socioeconomic factors.
Conflict of interest statement
The authors declare that they do not have a conflict of interest.
Comment in
- Health Equity Insights from Machine Learning Models. J Gen Intern Med. 2021 Aug;36(8):2475. doi: 10.1007/s11606-021-06908-0. Epub 2021 May 19. PMID: 34013468. No abstract available.