Sci Rep. 2025 Jul 1;15(1):20906.
doi: 10.1038/s41598-025-04921-5.

Prediction of cardiovascular diseases based on GBDT+LR

Zengxiao Chi et al. Sci Rep. 2025.

Abstract

Currently, there are over 300 million patients with cardiovascular diseases in China, and the impact of these diseases is becoming more severe as the population ages. Accurate and efficient prediction of cardiovascular disease risk is crucial for preventing disease progression and maintaining public cardiovascular health. This article combines gradient-boosting decision trees (GBDT) with logistic regression (LR) to predict the probability of cardiovascular disease risk. LR on its own is weak at combining features when the data are nonlinear. To address this, the proposed model feeds the predicted results of GBDT into LR as new features in place of the original ones. Using the UCI cardiovascular disease dataset, we compare the proposed model experimentally with other common disease classification algorithms, including LR, random forest (RF), and support vector machine (SVM). The results show that GBDT+LR outperforms the other models on multiple evaluation metrics, including accuracy, precision, specificity, F1 score, MCC, AUC, and AUPR, and has the best prediction performance. This article also builds a cardiovascular disease analysis and prediction platform with a separated front end and back end, based on the Spark big data framework and the Vue+SpringBoot framework, which predicts the probability of cardiovascular disease risk.
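The GBDT-to-LR hand-off described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses scikit-learn rather than Spark, synthetic data from `make_classification` rather than the UCI dataset, illustrative hyperparameters, and the common GBDT+LR formulation in which the leaf index reached in each tree is one-hot encoded and passed to LR. The abstract says only that the "predicted results of GBDT" become the new features, so the authors' exact encoding may differ. The helper `leaf_indices` is a hypothetical name introduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for a tabular risk dataset (assumption: not the UCI data).
X, y = make_classification(
    n_samples=2000, n_features=11, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
# Fit GBDT and LR on disjoint halves so LR does not learn from leaves
# that the trees have already overfitted to.
X_gbdt, X_lr, y_gbdt, y_lr = train_test_split(
    X_train, y_train, test_size=0.5, random_state=0, stratify=y_train
)

# Stage 1: GBDT learns nonlinear feature combinations.
gbdt = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
gbdt.fit(X_gbdt, y_gbdt)


def leaf_indices(model, X):
    """Return the leaf reached in each tree, shape (n_samples, n_estimators)."""
    # apply() returns (n_samples, n_estimators, 1) for binary classification.
    return model.apply(X)[:, :, 0]


# Stage 2: one-hot encode the leaf indices and fit LR on them
# instead of on the original features.
encoder = OneHotEncoder(handle_unknown="ignore")
lr = LogisticRegression(max_iter=1000)
lr.fit(encoder.fit_transform(leaf_indices(gbdt, X_lr)), y_lr)

# Predicted probability of the positive class for unseen samples.
risk = lr.predict_proba(encoder.transform(leaf_indices(gbdt, X_test)))[:, 1]
auc = roc_auc_score(y_test, risk)
print(f"GBDT+LR AUC on held-out data: {auc:.3f}")
```

Each tree contributes one categorical feature (which leaf a sample lands in), so LR receives learned feature combinations while the final output remains a calibrated-style probability from a linear model.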

Keywords: Cardiovascular disease; GBDT; GBDT+LR; LR; Machine learning; Prediction; Spark.

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1. GBDT+LR cardiovascular disease prediction flow chart.
Fig. 2. Boxplot for weight.
Fig. 3. Boxplot for weight data after cleaning.
Fig. 4. Disease category ratio chart.
Fig. 5. Correlation heatmap of features.
Fig. 6. GBDT+LR feature construction.
Fig. 7. GBDT constructs new feature process.
Algorithm 1. GBDT+LR.
Fig. 8. Functional module diagram of cardiovascular disease prediction platform.
Fig. 9. Personal prediction record view page.

