Int J Cardiol. 2026 Mar 1:446:134075.
doi: 10.1016/j.ijcard.2025.134075. Epub 2025 Dec 11.

Developing explainable machine learning models from biochemical and clinical data to predict all-cause and cause-specific mortality in CVD-cancer comorbidity: A longitudinal study based on NHANES

Free article

Lu Chai et al. Int J Cardiol.

Abstract

Background: Cardiovascular disease (CVD) and cancer are leading causes of mortality, often coexisting in aging populations. Patients with comorbidities face synergistically increased risks, yet accurate and interpretable prediction tools remain limited. Conventional Cox proportional hazards (Cox PH) models cannot fully capture nonlinear biochemical marker interactions, restricting predictive utility.
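As a brief illustration of the limitation noted above, the standard Cox PH model takes the log-hazard to be linear in the covariates, so a nonlinear effect or an interaction between two markers enters the model only if it is specified by hand. The symbols below (x_1, x_2, \beta_{12}) are generic notation and are not taken from the study.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right),
\qquad
\log\frac{h(t \mid \mathbf{x})}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p
% An interaction is absent unless added explicitly, e.g. + \beta_{12}\, x_1 x_2
```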

Objective: To develop interpretable machine learning (ML) models that predict all-cause, CVD-specific, and cancer-specific mortality in U.S. adults with comorbid CVD and cancer using routine biochemical profiles.

Methods: We analyzed 10 National Health and Nutrition Examination Survey (NHANES) cycles (1999-2018; N = 1094). Twenty-one biochemical markers and clinical covariates were screened via random survival forests (RSF). Cox PH, Cox model with elastic net regularization (Cox Net), gradient boosting, extreme survival trees (EST), and RSF were compared using time-dependent AUC, C-index, and Brier score, with 10-fold cross-validation and bootstrapping. SHapley Additive exPlanations (SHAP) quantified feature contributions.
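To make the C-index used in the model comparison concrete, here is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' code: the function name `harrell_c_index` and the toy follow-up data are hypothetical, and tied event times are simply skipped for brevity.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times        follow-up time per subject
    events       1 if death was observed, 0 if the subject was censored
    risk_scores  model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier subject had an event
        for j in range(n):
            if times[i] < times[j]:  # subject i died before j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from NHANES): five subjects, two censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.7, 0.3, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)  # 7 of 8 pairs concordant
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-indices should be read.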

Results: RSF consistently outperformed other models. Test-set C-indices were 0.729 (95% CI: 0.716-0.741) for all-cause, 0.731 (0.704-0.753) for CVD, and 0.674 (0.557-0.684) for cancer mortality. RSF achieved the lowest Brier scores (all-cause: 0.175; CVD: 0.152; cancer: 0.237), indicating superior calibration. Pairwise testing showed RSF significantly outperformed Cox PH and Cox Net for cancer mortality (P < 0.05). SHAP identified age, red cell distribution width, creatinine, and albumin as key predictors, reflecting pathways of inflammation, renal dysfunction, and metabolic dysregulation. RSF maintained moderate precision-recall performance for imbalanced outcomes.
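For readers unfamiliar with the Brier score, the sketch below shows the basic quantity at a single time horizon. It assumes every subject's vital status at the horizon is known, so it omits the censoring adjustment that survival analyses normally apply and is not necessarily the variant used in the study. The function name `brier_score` and the numbers are hypothetical.

```python
def brier_score(predicted_survival, alive_at_horizon):
    """Mean squared difference between predicted survival probability
    and observed status (1 = alive at the horizon, 0 = dead).

    Simplifying assumption: no subject is censored before the horizon.
    """
    if len(predicted_survival) != len(alive_at_horizon):
        raise ValueError("inputs must have the same length")
    if not predicted_survival:
        raise ValueError("inputs must be non-empty")
    squared_errors = [
        (p - float(a)) ** 2
        for p, a in zip(predicted_survival, alive_at_horizon)
    ]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predictions for four subjects (not study data).
toy_pred = [0.9, 0.8, 0.3, 0.6]
toy_alive = [1, 1, 0, 0]
toy_brier = brier_score(toy_pred, toy_alive)

# Reference point: predicting 0.5 for everyone always scores 0.25.
uninformative = brier_score([0.5] * 4, toy_alive)
```

Lower is better, and a constant prediction of 0.5 yields 0.25, which gives a rough reference point for the scores reported above.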

Conclusions: RSF outperformed conventional models by capturing nonlinear interactions while remaining interpretable. This framework supports risk stratification for CVD-cancer comorbidity, highlighting the clinical value of explainable ML in precision medicine.

Keywords: Cancer; Cardiovascular disease; Machine learning; Random survival forest; SHAP analysis.

Conflict of interest statement

Declaration of competing interest: None declared.
