Eur Heart J Digit Health. 2023 Oct 17;5(1):30-40.
doi: 10.1093/ehjdh/ztad058. eCollection 2024 Jan.

Improving cardiovascular risk prediction through machine learning modelling of irregularly repeated electronic health records

Chaiquan Li et al. Eur Heart J Digit Health.

Abstract

Aims: Existing electronic health records (EHRs) often contain abundant but irregularly timed longitudinal measurements of risk factors. In this study, we aim to leverage such data to improve the risk prediction of atherosclerotic cardiovascular disease (ASCVD) by applying machine learning (ML) algorithms, an approach that could enable automatic screening of the population.

Methods and results: A total of 215 744 Chinese adults aged 40 to 79 years without a history of cardiovascular disease (6081 cases) were included from an EHR-based longitudinal cohort study. To keep the model interpretable, the predictors used were demographic characteristics, medication treatment, and repeatedly measured records of lipids, glycaemia, obesity, blood pressure, and renal function. The primary outcome was ASCVD, defined as non-fatal acute myocardial infarction, coronary heart disease death, or fatal or non-fatal stroke. The eXtreme Gradient Boosting (XGBoost) algorithm and Least Absolute Shrinkage and Selection Operator (LASSO) regression models were derived to predict the 5-year ASCVD risk. In the validation set, compared with the refitted Chinese guideline-recommended Cox model (i.e. the China-PAR), the XGBoost model had a significantly higher C-statistic of 0.792 (difference in C-statistics: 0.011, 0.006-0.017, P < 0.001), with similar results for LASSO regression (difference in C-statistics: 0.008, 0.005-0.011, P < 0.001). The XGBoost model showed the best calibration (men: Dx = 0.598, P = 0.75; women: Dx = 1.867, P = 0.08). Moreover, the risk distributions produced by the ML algorithms differed from that of the conventional model. The net reclassification improvement of XGBoost and LASSO over the Cox model was 3.9% (1.4-6.4%) and 2.8% (0.7-4.9%), respectively.
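To make the C-statistic comparison above concrete, here is a minimal Python sketch of how a difference in C-statistics between two risk models can be computed. It is an illustration under simplifying assumptions, not the authors' procedure: the outcome is treated as binary (event within the horizon or not), censoring is ignored, and the function name `c_statistic` and the toy risk scores are hypothetical. The abstract does not say how the confidence intervals were obtained, so none are computed here.

```python
from itertools import product


def c_statistic(risks, events):
    """Concordance for a binary outcome.

    Returns the share of (case, non-case) pairs in which the case has the
    higher predicted risk; tied predictions count as half a concordant pair.
    Assumption: censoring is ignored, unlike a survival-analysis C-statistic.
    """
    cases = [r for r, e in zip(risks, events) if e == 1]
    non_cases = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk, non_case_risk in product(cases, non_cases):
        if case_risk > non_case_risk:
            concordant += 1.0
        elif case_risk == non_case_risk:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical toy data: 1 = ASCVD event, 0 = no event.
events = [1, 0, 1, 0, 0, 1]
risk_reference = [0.20, 0.10, 0.05, 0.08, 0.02, 0.30]  # stand-in for a Cox-type model
risk_new = [0.25, 0.10, 0.12, 0.08, 0.02, 0.30]        # stand-in for an ML model

c_reference = c_statistic(risk_reference, events)  # 7 of 9 pairs concordant
c_new = c_statistic(risk_new, events)              # all 9 pairs concordant
delta_c = c_new - c_reference

print(f"reference C = {c_reference:.3f}, new C = {c_new:.3f}, difference = {delta_c:.3f}")
```

On real cohort data the difference would be far smaller than in this toy example (the abstract reports 0.011 for XGBoost), and an interval for it would typically come from resampling, which is omitted here.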

Conclusion: Machine learning algorithms applied to irregular, repeated real-world data could improve cardiovascular risk prediction. They reclassified individuals significantly better than the conventional model, identifying the high-risk population more accurately.
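The reclassification claim rests on the net reclassification improvement (NRI). As a rough sketch, a two-category NRI can be computed as below. The abstract does not state which NRI variant or risk cut-off was used, so the single 10% threshold, the function name `categorical_nri`, and the toy data are all assumptions made for illustration.

```python
def categorical_nri(old_risks, new_risks, events, threshold=0.10):
    """Two-category net reclassification improvement.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)],
    where "up" means moving from below to at-or-above the threshold.
    Assumption: one hypothetical high-risk cut-off (default 10%).
    """
    up_event = down_event = n_event = 0
    up_non = down_non = n_non = 0

    for old, new, event in zip(old_risks, new_risks, events):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if event == 1:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_non += 1
            up_non += moved_up
            down_non += moved_down

    if n_event == 0 or n_non == 0:
        raise ValueError("need at least one event and one non-event")

    nri_events = (up_event - down_event) / n_event
    nri_non_events = (down_non - up_non) / n_non
    return nri_events + nri_non_events


# Hypothetical toy data: 3 events followed by 4 non-events.
events = [1, 1, 1, 0, 0, 0, 0]
old_risks = [0.08, 0.12, 0.09, 0.11, 0.05, 0.12, 0.04]  # conventional model
new_risks = [0.13, 0.15, 0.07, 0.06, 0.04, 0.14, 0.03]  # ML model

nri = categorical_nri(old_risks, new_risks, events)
print(f"NRI = {nri:.3f}")
```

A positive value means the new model moves more events into the high-risk category, and more non-events out of it, than it moves in the wrong direction.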

Keywords: Prediction; Preventive Cardiology; Risk.

Conflict of interest statement

Conflict of interest: P.G. reported receiving research funds from Bayer and Merck. These funding sources had no relation to this study. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Figures

Figure 1. The study design and categories of predictors. (A) The cohort design of the study; (B) predictors of seven aetiological categories included in different approaches.
Figure 2. The difference in C-statistics scores compared with the refitted China-PAR model. The results are given based on the validation set of 31 544.
Figure 3. Calibration plots of different models by sex. The results are given based on the validation set of 31 544.
Figure 4. Distribution of predicted risk given by the XGBoost model and the refitted China-PAR model in the validation set.
