Construction of machine learning diagnostic models for cardiovascular pan-disease based on blood routine and biochemical detection data
- PMID: 39342281
- PMCID: PMC11439295
- DOI: 10.1186/s12933-024-02439-0
Abstract
Background: Cardiovascular disease, also known as circulation system disease, remains the leading cause of morbidity and mortality worldwide. Traditional methods for diagnosing cardiovascular disease are often expensive and time-consuming. This study therefore aimed to construct machine learning models for diagnosing cardiovascular diseases from easily accessible blood routine and biochemical detection data, and to explore the hematologic features characteristic of cardiovascular diseases, including some metabolic indicators.
Methods: After data preprocessing, blood routine and biochemical detection data from 25,794 healthy people and 32,822 patients with circulation system disease were included in the study. Models were constructed with logistic regression, random forest, support vector machine, eXtreme Gradient Boosting (XGBoost), and a deep neural network. Finally, the SHAP algorithm was used to interpret the models.
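SHAP interprets a model by assigning each feature a Shapley value for a given prediction. As a minimal sketch of that idea only (not the study's pipeline and not the SHAP library), the code below computes exact Shapley values by enumerating feature subsets. The scoring function `toy_score`, its three inputs, and the all-zero baseline are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy score over three made-up standardized features
# (named after K, TP, and ALB only for readability).
def toy_score(features):
    k, tp, alb = features
    return 0.5 * k - 0.3 * tp + 0.2 * k * alb


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
```

The attributions in `phi` sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that makes SHAP values readable as per-feature contributions. The interaction term is split evenly between the two features involved.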
Results: The circulation system disease prediction model constructed with XGBoost performed best (AUC: 0.9921 (0.9911-0.9930); Acc: 0.9618 (0.9588-0.9645); Sn: 0.9690 (0.9655-0.9723); Sp: 0.9526 (0.9477-0.9572); PPV: 0.9631 (0.9592-0.9668); NPV: 0.9600 (0.9556-0.9644); MCC: 0.9224 (0.9165-0.9279); F1 score: 0.9661 (0.9634-0.9686)). Most models distinguishing among the various circulation system diseases also performed well; the model distinguishing dilated cardiomyopathy from other circulation system diseases performed best (AUC: 0.9267 (0.8663-0.9752)). Model interpretation with the SHAP algorithm indicated that biochemical detection features, such as potassium (K), total protein (TP), albumin (ALB), and indirect bilirubin (NBIL), made the major contributions to predicting circulation system disease. For the models distinguishing among circulation system diseases, however, red blood cell count (RBC), K, direct bilirubin (DBIL), and glucose (GLU) were the top four features.
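The threshold-based metrics reported above (Acc, Sn, Sp, PPV, NPV, MCC, F1 score) all follow from the four cells of a binary confusion matrix. The sketch below applies their standard definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Acc": (tp + tn) / total,            # accuracy
        "Sn": tp / (tp + fn),                # sensitivity (recall)
        "Sp": tn / (tn + fp),                # specificity
        "PPV": tp / (tp + fp),               # positive predictive value
        "NPV": tn / (tn + fn),               # negative predictive value
        "MCC": (tp * tn - fp * fn) / mcc_denominator,
        "F1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of PPV and Sn
    }


# Made-up counts for illustration only (not the study's data)
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

AUC is not included because it depends on the ranking of predicted scores across all thresholds and cannot be recovered from a single confusion matrix.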
Conclusions: The present study constructed multiple models using 50 features from blood routine and biochemical detection data for the diagnosis of various circulation system diseases. It also explored the hematologic features characteristic of these diseases, including some metabolism-related indicators. This cost-effective approach may make screening accessible to more people and help diagnose and prevent circulation system diseases.
Keywords: Biochemical detection; Blood routine; Cardiovascular disease; Circulation system disease; Machine learning; Metabolic indicator.
© 2024. The Author(s).
Conflict of interest statement
The authors declare no competing interests.