Using Machine Learning to Predict Weight Gain in Adults: an Observational Analysis From the All of Us Research Program
- PMID: 39742657
- PMCID: PMC11911080
- DOI: 10.1016/j.jss.2024.11.042
Abstract
Introduction: Obesity, defined as a body mass index ≥30 kg/m², is a major public health concern in the United States. Preventative approaches are essential, but they are limited by an inability to accurately identify individuals at highest risk of weight gain. Our objective was to develop accurate weight gain prediction models using the National Institutes of Health All of Us dataset. We hypothesized that machine learning models using both electronic health record and behavioral survey data would outperform models using electronic health record data alone.
Methods: The All of Us dataset was used to identify adults between 18 and 70 y old with weight measurements 2 y apart between 2008 and 2022. Patients with a history of cancer, bariatric surgery, or pregnancy were excluded. Demographics, vital signs, laboratory results, comorbidities, and survey data (Alcohol Use Disorder Identification Test, Patient-Reported Outcomes Measurement Information System physical and mental health scores) were included as model parameters. Elastic net and XGBoost machine learning models were developed with and without survey data to predict ≥10% total body weight gain within 2 y. The data were split into a training sample (60%) and a testing sample (40%), and parameters were tuned using 10-fold cross-validation. Performance was compared using area under the receiver operating characteristic curves (AUCs).
Results: Our cohort consisted of 34,715 patients (mean [SD] age 50.9 [13.4] y; 45.7% White; 55.3% female). Over a 2-y span, 10.4% of the cohort gained ≥10% total body weight. AUCs were 0.677 [95% DeLong confidence interval 0.665-0.688] for elastic net and 0.706 [0.695-0.717] for XGBoost. Incorporation of survey data did not improve predictability, with AUCs of 0.681 [0.669-0.692] and 0.705 [0.694-0.716], respectively.
Conclusions: Our machine learning weight gain prediction models had modest performance that was not improved by survey data. The addition of other All of Us variables, including genomic data, may be informative in future studies.
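As an illustration of the outcome definition and evaluation metric named in the Methods, the following is a minimal sketch in Python, not the authors' code. The function names (`gained_10_percent`, `roc_auc`) and all weights and scores are hypothetical toy values chosen for illustration. The AUC is computed as the fraction of positive-negative pairs in which the positive case receives the higher score, which is one standard way to obtain the area under the receiver operating characteristic curve.

```python
from typing import Sequence


def gained_10_percent(baseline_kg: float, followup_kg: float) -> bool:
    """Outcome label: True if weight rose by at least 10% of the baseline weight."""
    return (followup_kg - baseline_kg) / baseline_kg >= 0.10


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (baseline kg, follow-up kg) pairs and model risk scores.
weights = [(80.0, 84.0), (70.0, 71.0), (80.0, 90.0), (60.0, 69.0)]
scores = [0.10, 0.40, 0.35, 0.80]

labels = [int(gained_10_percent(b, f)) for b, f in weights]
auc = roc_auc(labels, scores)
print(labels, auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of roughly 0.68 to 0.71 are described as modest.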
Keywords: Adult; All of Us; Machine learning; Models; Obesity; Prediction; Weight gain.
Published by Elsevier Inc.
