BMC Cardiovasc Disord. 2025 Jul 4;25(1):466.
doi: 10.1186/s12872-025-04928-w.

Prediction of three-year all-cause mortality in patients with heart failure and atrial fibrillation using the CatBoost model


Jiacan Wu et al. BMC Cardiovasc Disord. 2025.

Abstract

Background: Heart failure and atrial fibrillation (HF-AF) frequently coexist, resulting in complex interactions that substantially elevate mortality risk. This study aimed to develop and validate a machine learning (ML) model predicting the 3-year all-cause mortality risk in HF-AF patients to support personalized risk stratification and management.

Method: This retrospective cohort study included 558 HF-AF patients admitted in 2018, with a median follow-up duration of 1,185 days. The cohort was randomly divided into training (70%) and test (30%) sets. Feature selection utilized the Boruta algorithm and least absolute shrinkage and selection operator regression. Six ML models were trained using tenfold cross-validation and optimized via grid search. Model performance was evaluated across 12 metrics, including the area under the receiver operating characteristic curve (AUC), to identify the best-performing model. Subsequently, Shapley Additive exPlanations (SHAP) analysis was used to interpret the optimal model and investigate interactions between features.
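The split-and-validate scheme described above can be sketched in a few lines. This is a minimal illustration only, not the authors' code: the function names, the random seed, and the use of a plain (unstratified) shuffle are assumptions, and the abstract does not say how the 70%/30% split was rounded or whether it was stratified by outcome.

```python
import random

def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle 0..n-1 and reserve `test_fraction` of the indices as the test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary, hypothetical choice
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]  # (train, test)

def k_fold_indices(indices, k=10):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, held_out in enumerate(folds):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, held_out

# 558 patients, as reported in the abstract
train_idx, test_idx = train_test_split_indices(558)
folds = list(k_fold_indices(train_idx, k=10))
```

In a grid search, each candidate hyperparameter setting would be fitted on every `fit` list and scored on the matching `held_out` list, and only the winning setting would then be evaluated once on `test_idx`.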

Results: Of the 558 patients, 215 reached the primary endpoint. Feature selection identified 14 key variables for model development. The best-performing model, CatBoost, achieved the highest AUC (0.809) and demonstrated robust performance across multiple evaluation metrics. SHAP analysis highlighted the New York Heart Association (NYHA) classification, absolute lymphocyte count (ALC), high-sensitivity C-reactive protein, B-type natriuretic peptide (BNP), and age as key predictors. SHAP interaction analysis identified several feature interactions, with relatively strong ones observed between ALC and NYHA classification, and ALC and BNP.
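For readers less familiar with the headline metric, the AUC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it that way on made-up scores; the function name and the toy data are hypothetical, and only the counts 215 and 558 come from the abstract.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy example (not study data): 3 of the 4 positive/negative pairs are ordered correctly
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75

# Event rate implied by the reported counts
event_rate = 215 / 558  # about 0.385
```

An event rate near 38.5% means the two classes are moderately imbalanced, which is one reason to report metrics beyond accuracy.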

Conclusions: CatBoost was identified as the optimal model for predicting three-year all-cause mortality in HF-AF patients, potentially aiding clinicians in risk stratification and individualized treatment planning to improve patient outcomes.

Keywords: All-cause mortality; Atrial fibrillation; Heart failure; Machine learning; Prediction model.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: The present study involves human participants and was approved by the Ethics Committee of the First Hospital of Chongqing Medical University (reference number 2020-528) and adhered to the guidelines of the Helsinki Declaration. Written informed consent was obtained from all individual participants. Consent for publication: Informed consent was obtained from all individual participants included in the study. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Flowchart of patient selection, data processing, model development, and validation. Abbreviations: HF, heart failure; AF, atrial fibrillation; HF-AF, heart failure with atrial fibrillation; LASSO, least absolute shrinkage and selection operator; CatBoost, Categorical Boosting; NN, Neural Networks; LR, logistic regression; RF, Random Forest; SVM, Support Vector Machines; GBDT, gradient boosting decision tree; AUC, area under the curve; PPV, Positive Predictive Value; NPV, Negative Predictive Value; ACC, accuracy; F1, the harmonic mean of precision and recall; MCC, Matthews Correlation Coefficient; BS, Brier Score; DCA, Decision Curve Analysis; SHAP, Shapley Additive exPlanations
Fig. 2
ROC curves for each model in the training and test sets. A ROC curve in the training set, showing RF with the highest AUC (0.935), followed by GBDT (0.930), CatBoost (0.869), NN (0.801), SVM (0.794), and Lasso-LR (0.760); B ROC curve in the test set, demonstrating CatBoost with the highest AUC (0.809), followed by NN (0.802), Lasso-LR (0.793), RF (0.790), SVM (0.773), and GBDT (0.732). Abbreviations: ROC, receiver operating characteristic; AUC, area under the curve; CatBoost, Categorical Boosting; NN, Neural Networks; Lasso-LR, least absolute shrinkage and selection operator-penalized logistic regression; RF, Random Forest; SVM, Support Vector Machines; GBDT, gradient boosting decision tree
Fig. 3
Evaluation metrics for each model in the training and test sets. A Evaluation metrics in the training set; B Evaluation metrics in the test set. Abbreviations: CatBoost, Categorical Boosting; NN, Neural Networks; Lasso-LR, least absolute shrinkage and selection operator-penalized logistic regression; RF, Random Forest; SVM, Support Vector Machines; GBDT, gradient boosting decision tree; PPV, Positive Predictive Value; NPV, Negative Predictive Value; ACC, accuracy; F1, the harmonic mean of precision and recall; MCC, Matthews Correlation Coefficient
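Several of the threshold-based metrics abbreviated above derive from the same four confusion-matrix counts. The following sketch shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else float("nan")  # undefined when the denominator is 0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "ACC": ratio(tp + tn, tp + fp + tn + fn),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),  # harmonic mean of PPV and sensitivity
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }

# Made-up counts for illustration only
m = confusion_metrics(tp=40, fp=10, tn=90, fn=20)
```

With these illustrative counts, accuracy is 0.8125 while MCC is roughly 0.59, showing how MCC penalizes the missed positives that accuracy partly hides.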
Fig. 4
Calibration and DCA curves for each model in the training and test sets. A Calibration curves in the training set. GBDT and RF demonstrated better calibration, with curves closest to the ideal diagonal line. CatBoost and NN showed moderate calibration, while SVM and Lasso-LR exhibited relatively lower predicted probabilities compared to other models. B Calibration curves in the test set. NN and CatBoost demonstrated better calibration, followed by RF and Lasso-LR, while GBDT and SVM showed relatively lower predicted probabilities. C Decision curve analysis in the training set. GBDT, RF, and CatBoost yielded the highest net benefit across most threshold probabilities, indicating superior clinical utility. D Decision curve analysis in the test set. All models demonstrated greater net benefit than the treat-all and treat-none strategies across a wide range of threshold probabilities. Abbreviations: CatBoost, Categorical Boosting; NN, Neural Networks; Lasso-LR, least absolute shrinkage and selection operator-penalized logistic regression; RF, Random Forest; SVM, Support Vector Machines; GBDT, gradient boosting decision tree; BS, Brier Score; DCA, Decision Curve Analysis
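The two quantities behind these panels can be written down compactly. The sketch below uses the standard definitions of the Brier score and of net benefit at a threshold probability; the function names and the four-patient toy data are hypothetical, and the study's own implementation is not described in the abstract.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data for illustration only
y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.6, 0.7]
bs = brier_score(y, p)                 # about 0.175
nb_model = net_benefit(y, p, 0.5)      # 2/4 - (1/4) * 1 = 0.25
nb_all = net_benefit_treat_all(y, 0.5) # 0.5 - 0.5 * 1 = 0.0
```

A decision curve plots `net_benefit` against a range of thresholds; the treat-none strategy is the horizontal line at zero.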
Fig. 5
SHAP explanations for CatBoost model. A Summary plot of the SHAP values for CatBoost. Each point represents a SHAP value for a feature in an individual patient. Features are ranked by their importance based on the mean absolute SHAP values. Orange points indicate higher feature values, while blue points indicate lower values. A positive SHAP value indicates a greater contribution to predicted risk, whereas a negative value indicates a protective effect; B Ranking of feature importance based on the average absolute SHAP values. The bar plot displays the mean absolute SHAP value for each feature, reflecting its average contribution to the model’s predictions across all samples. Features with higher values have a greater impact on the output of the CatBoost model. NYHA classification, ALC, and hs-CRP are the top three most influential features. Abbreviations: SHAP, Shapley Additive exPlanations; CatBoost, Categorical Boosting; NYHA, New York Heart Association; ALC, absolute lymphocyte count; hs-CRP, high-sensitivity C-reactive protein; BNP, B-type natriuretic peptide; LVEDD, left ventricular end-diastolic dimension; BMI, body mass index; BUN, blood urea nitrogen; RAD, right atrial dimension; ALB, albumin
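To make the idea of a per-feature SHAP value concrete, the sketch below computes exact Shapley values for a three-feature toy function by averaging each feature's marginal contribution over all feature orderings. It is a simplification under stated assumptions: the toy `predict` function, the single all-zero baseline, and the function names are hypothetical, and practical SHAP tooling for tree models uses faster algorithms and a background dataset rather than brute-force enumeration.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to a single baseline point."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]       # switch feature i from baseline to its actual value
            new = predict(current)
            phi[i] += new - prev    # marginal contribution of feature i in this ordering
            prev = new
    return [total / len(orders) for total in phi]

def predict(v):
    """Hypothetical toy model with one interaction term between features 0 and 2."""
    return 2 * v[0] + v[1] + v[0] * v[2]

phi = shapley_values(predict, x=[1, 1, 1], baseline=[0, 0, 0])
# phi == [2.5, 1.0, 0.5]; the values sum to predict(x) - predict(baseline) = 4
```

The interaction term is split evenly between the two features involved, which is the intuition behind the SHAP interaction analysis mentioned in the abstract.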
Fig. 6
SHAP dependence plot for each feature. Each plot (A-N) demonstrates how changes in feature values affect the model’s predictions, with higher SHAP values indicating a stronger impact on the outcome. The features include: A NYHA classification, B ALC, C hs-CRP, D BNP, E Age, F LVEDD, G BMI, H Anticoagulation duration, I BUN, J Anemia, K Length of stay, L RAD, M ALB, and N Digoxin. Abbreviations: SHAP, Shapley Additive exPlanations; NYHA, New York Heart Association; ALC, absolute lymphocyte count; hs-CRP, high-sensitivity C-reactive protein; BNP, B-type natriuretic peptide; LVEDD, left ventricular end-diastolic dimension; BMI, body mass index; BUN, blood urea nitrogen; RAD, right atrial dimension; ALB, albumin
