Front Aging Neurosci. 2025 Jul 16:17:1532884.
doi: 10.3389/fnagi.2025.1532884. eCollection 2025.

Development and validation of deep learning- and ensemble learning-based biological ages in the NHANES study

Yushu Huang et al. Front Aging Neurosci. 2025.

Abstract

Introduction: Conventional machine learning (ML) approaches for constructing biological age (BA) have predominantly relied on blood-based markers, limiting their scope. This study aims to develop and validate novel ML-based BA models using a comprehensive set of clinical, behavioral, and socioeconomic factors and evaluate their predictive performance for mortality.

Methods: We analyzed data from 24,985 participants in the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2010, with follow-up extending to 31 December 2019, or until death or loss to follow-up. Thirty features, including blood and urine biochemistry, physical examination data, behavioral traits, and socioeconomic factors, were selected using the Least Absolute Shrinkage and Selection Operator (LASSO). These features were utilized to train deep neural networks (DNN) and ensemble learning models, specifically the Deep Biological Age (DBA) and Ensemble Biological Age (EnBA), with chronological age (CA) as the reference label. Model performance was assessed using mean absolute error (MAE), while interpretability was explored using Shapley Additive exPlanation (SHAP). Predictive accuracy of DBA and EnBA for mortality was compared with Phenotypic Age (PhenoAge) using the area under the curve (AUC) derived from Cox proportional hazards models and hazard ratios (HR), adjusted for demographics and lifestyle factors. Sensitivity analyses were performed to ensure robustness.
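The evaluation step described above can be illustrated with a minimal sketch. It computes MAE between chronological age and a predicted biological age, and an age-acceleration value taken as the residual of predicted age after a least-squares fit on chronological age. The residual definition is an assumption (a common convention that the abstract does not spell out), and the function names and toy values are hypothetical, not the authors' code or data.

```python
from statistics import mean


def mean_absolute_error(chronological, predicted):
    """Average absolute gap (in years) between predicted and chronological age."""
    if len(chronological) != len(predicted):
        raise ValueError("inputs must have the same length")
    return mean(abs(p - c) for c, p in zip(chronological, predicted))


def age_acceleration(chronological, predicted):
    """Residuals of predicted age after a least-squares fit on chronological age.

    ASSUMPTION: acceleration is defined as this regression residual; the
    abstract does not state the exact definition used in the study.
    """
    ca_mean = mean(chronological)
    ba_mean = mean(predicted)
    sxx = sum((c - ca_mean) ** 2 for c in chronological)
    sxy = sum((c - ca_mean) * (p - ba_mean)
              for c, p in zip(chronological, predicted))
    slope = sxy / sxx
    intercept = ba_mean - slope * ca_mean
    return [p - (intercept + slope * c)
            for c, p in zip(chronological, predicted)]


# Toy values for illustration only (not NHANES data).
ca = [30.0, 45.0, 60.0, 75.0]
ba = [33.0, 43.0, 64.0, 74.0]

mae = mean_absolute_error(ca, ba)
acc = age_acceleration(ca, ba)
print(f"MAE = {mae:.2f} years")
print("Age acceleration:", [round(a, 2) for a in acc])
```

A positive residual marks a participant whose predicted age is higher than expected for their chronological age; by construction the residuals sum to zero across the sample.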

Results: DBA and EnBA accurately predicted actual age (MAE = 2.98 and 3.58 years, respectively) and demonstrated strong predictive capability for all-cause mortality, with AUCs of 0.896 (95% CI: 0.891-0.898) for DBA and 0.889 (95% CI: 0.884-0.894) for EnBA. Higher DBA and EnBA accelerations were significantly associated with increased mortality risk (HR = 1.059 and 1.039, respectively). SHAP analysis highlighted prescription medication usage, hepatitis B surface antibody status, and vigorous physical activity as the most influential features contributing to DBA predictions. Furthermore, BA acceleration was linked to elevated risk of death from specific chronic conditions, including cardiovascular and cerebrovascular diseases and cancer.
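To make the hazard ratios easier to read, the sketch below rescales them to a larger difference in age acceleration. It assumes each HR is expressed per one year of acceleration (the abstract does not state the unit) and relies on the log-linear form of the Cox model, under which the HR for a k-unit difference is the per-unit HR raised to the power k. The function name is hypothetical.

```python
import math


def scaled_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a difference of `units`, given a per-unit hazard ratio.

    Under the Cox model the log-hazard is linear in the covariate, so the
    per-unit log-HR is multiplied by the size of the difference.
    """
    if hr_per_unit <= 0:
        raise ValueError("hazard ratio must be positive")
    return math.exp(units * math.log(hr_per_unit))


# ASSUMPTION: the reported HRs are per one year of age acceleration.
hr_dba_10 = scaled_hazard_ratio(1.059, 10)
hr_enba_10 = scaled_hazard_ratio(1.039, 10)
print(f"DBA, 10-year acceleration:  HR ~ {hr_dba_10:.2f}")
print(f"EnBA, 10-year acceleration: HR ~ {hr_enba_10:.2f}")
```

Under that assumption, a 10-year acceleration would correspond to hazard ratios of roughly 1.77 for DBA and 1.47 for EnBA.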

Discussion: Our study successfully developed and validated two ML-based BA models capable of accurately predicting both all-cause and cause-specific mortality. These findings suggest that the DBA and EnBA models hold promise for early identification of high-risk individuals, potentially facilitating timely preventive interventions and improving population health outcomes.

Keywords: aging; biological age; deep learning; deep neural networks; machine learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Three side-by-side Receiver Operating Characteristic (ROC) curve graphs labeled A, B, and C compare four models. Each graph shows sensitivity versus 1-specificity with colored lines representing each model. Graph A (DBA) features AUC values from 0.896 to 0.906, Graph B (EnBA) has AUC values from 0.889 to 0.906, and Graph C (PhenoAge V2) shows AUC values from 0.902 to 0.912.
FIGURE 1
ROC curves. Model 1 predicts mortality based on Deep Biological Age (DBA) (A), Ensemble Biological Age (EnBA) (B), or Phenotypic Age Version 2 (PhenoAge V2) (C). Model 2 is based on chronological age (CA). Model 3 is based on Model 1, adjusting for CA and gender. Model 4 is based on Model 3, further adjusting for education, BMI, drinking, smoking, sleep, and physical activity.
Graphical representation of hazard ratios for causes of death in three panels (A, B, C). Each panel lists causes on the left, with HRs shown as purple squares and 95% CIs as lines. Panel A shows higher ratios than Panels B and C.
FIGURE 2
Forest plot. (A) DBA-Acc; (B) EnBA-Acc; (C) PhenoAge-Acc V2. The model was based on age acceleration (Age-Acc), adjusting for age, gender, education level, Body Mass Index (BMI), smoking, drinking, sleep, and exercise status.
Panel A: SHAP summary plot of feature impacts (pink = high, blue = low); prescription medication has the largest effect. Panel B: Bar chart of average impacts; prescription medication has the highest influence.
FIGURE 3
Shapley Additive exPlanation (SHAP) values for the DNN model. (A) SHAP summary plot; (B) feature importance ranking by mean SHAP value. Each dot represents a single participant’s SHAP value (impact on predicted biological age). Color indicates feature value (red = high, blue = low). Position on the x-axis shows whether that value contributes to increasing (positive, right) or decreasing (negative, left) biological age.
