Front Endocrinol (Lausanne). 2024 May 28:15:1390729.
doi: 10.3389/fendo.2024.1390729. eCollection 2024.

Machine learning model for cardiovascular disease prediction in patients with chronic kidney disease

He Zhu et al. Front Endocrinol (Lausanne).

Abstract

Introduction: Cardiovascular disease (CVD) is the leading cause of death in patients with chronic kidney disease (CKD). This study aimed to develop CVD risk prediction models using machine learning to support clinical decision making and improve patient prognosis.

Methods: Electronic medical records from patients with CKD at a single center from 2015 to 2020 were used to develop machine learning models for the prediction of CVD. Least absolute shrinkage and selection operator (LASSO) regression was used to select important features for predicting the risk of developing CVD. Seven machine learning classification algorithms were used to build models, which were evaluated by receiver operating characteristic curves, accuracy, sensitivity, specificity, and F1-score; SHapley Additive exPlanations (SHAP) was used to interpret the model results. CVD was defined as a composite of cardiovascular events, including coronary heart disease (coronary artery disease, myocardial infarction, angina pectoris, and coronary artery revascularization), cerebrovascular disease (hemorrhagic stroke and ischemic stroke), death from all causes (cardiovascular death, non-cardiovascular death, and death of unknown cause), congestive heart failure, and peripheral artery disease (aortic aneurysm, aortic or other peripheral arterial revascularization). Events were determined by reviewing medical records.
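
The threshold-based metrics named in the Methods (accuracy, sensitivity, specificity, and F1-score) can all be derived from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives, respectively.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision is needed for F1, the harmonic mean of precision and recall.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration; not the study's confusion matrix.
metrics = classification_metrics(tp=80, fp=20, tn=180, fn=20)
```

With these made-up counts, sensitivity is 80/100 and specificity is 180/200, which shows how a model can reject negatives well while still missing a fifth of true events.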

Results: This study included 8,894 patients with CKD, of whom 2,304 (25.9%) experienced a composite CVD event. LASSO regression identified eight important features for predicting the risk of CVD in patients with CKD: age, history of hypertension, sex, antiplatelet drugs, high-density lipoprotein, sodium ions, 24-h urinary protein, and estimated glomerular filtration rate. In the test set, the model developed using Extreme Gradient Boosting (XGBoost) had an area under the curve of 0.89, outperforming the other models and indicating the best CVD predictive performance.
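
For readers less familiar with the area under the curve (AUC) reported above, it can be read as the probability that a randomly chosen patient with an event receives a higher predicted score than a randomly chosen patient without one. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are hypothetical and unrelated to the study data, and this is not necessarily how the authors computed their AUC.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes; scores: predicted risk for each case.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 3 of the 4 positive/negative pairs
# are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; rank-based implementations are preferable for cohorts of several thousand patients.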

Conclusion: This study established a CVD risk prediction model for patients with CKD, based on routine clinical diagnostic and treatment data, with good predictive accuracy. This model is expected to provide a scientific basis for the management and treatment of patients with CKD.

Keywords: cardiovascular disease; chronic kidney disease; electronic medical records; machine learning; prediction model.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1. Flowchart of participant screening. CKD, chronic kidney disease; CVD, cardiovascular disease.
Figure 2. Feature selection using the LASSO binomial regression model. LASSO, least absolute shrinkage and selection operator. (A) The partial likelihood deviance (binomial deviance) curve was plotted versus log(lambda). LASSO coefficient profiles of the 31 baseline features. (B) Tuning parameter (lambda) selection in the LASSO model used 5-fold cross-validation, with variable selection by the minimum criteria. LASSO coefficient profiles of the 8 features.
Figure 3. Performance of the 7 prediction models on the training dataset (A) and testing dataset (B). SVM, support vector machine; Log Reg, logistic regression; XGBoost, extreme gradient boosting; KNN, k-nearest neighbor; NB, naïve Bayesian; RF, random forest; BPNN, backpropagation neural network.
Figure 4. (A) SHAP summary plot of the XGBoost model with 8 variables. (B) An importance matrix plot of the XGBoost model.
Figure 5. SHAP dependence plot of the XGBoost model.
Figure 6. The confusion matrix of the XGBoost model predictions.
