Sci Rep. 2022 Jan 20;12(1):1033.
doi: 10.1038/s41598-021-04649-y.

Application of ensemble machine learning algorithms on lifestyle factors and wearables for cardiovascular risk prediction

Weiting Huang et al. Sci Rep. 2022.

Abstract

This study evaluated novel data sources for cardiovascular risk prediction, including a detailed lifestyle questionnaire and continuous blood pressure monitoring, using ensemble machine learning algorithms (MLAs). The conventional risk score used as the reference was the Framingham Risk Score (FRS). The outcome variables were low risk, defined as a calcium score of 0, and high risk, defined as a calcium score of 100 or above. Ensemble MLAs were built from naive Bayes, random forest and support vector classifier models for the low-risk category, and from generalized linear regression, support vector regressor and stochastic gradient descent regressor models for the high-risk category. The MLAs were trained on 600 Southeast Asians aged 21 to 69 years who were free of cardiovascular disease. All MLAs outperformed the FRS for both the low- and high-risk categories. An MLA based on the lifestyle questionnaire alone achieved an AUC of 0.715 (95% CI 0.681, 0.750) for low risk and 0.710 (95% CI 0.653, 0.766) for high risk. Combining all groups of risk factors (lifestyle survey questionnaires, clinical blood tests, 24-h ambulatory blood pressure and heart rate monitoring) with feature selection further improved the AUC for predicting the low and high CVD risk groups to 0.791 (95% CI 0.759, 0.822) and 0.790 (95% CI 0.745, 0.836), respectively. Besides conventional predictors, self-reported physical activity, average daily heart rate, awake blood pressure variability and percentage of time in diastolic hypertension were important contributors to CVD risk classification.
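The outcome definition and the AUC figures quoted above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes two separate binary tasks (calcium score of 0 versus anything else, and calcium score of 100 or above versus anything else), computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and uses a percentile bootstrap for the 95% interval. The abstract does not say how its intervals were obtained, so the bootstrap is an assumption, and all function names are hypothetical.

```python
import random


def low_risk_label(cac_score):
    """1 if the calcium score is 0 (low risk), else 0. Assumed binary task."""
    return 1 if cac_score == 0 else 0


def high_risk_label(cac_score):
    """1 if the calcium score is 100 or above (high risk), else 0. Assumed binary task."""
    return 1 if cac_score >= 100 else 0


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With model scores for the low-risk task in `scores` and `labels = [low_risk_label(c) for c in cac_scores]`, calling `auc(labels, scores)` and `bootstrap_auc_ci(labels, scores)` would give a point estimate and interval in the same form as the values reported in the abstract.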

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. Modelling flow chart using ensemble MLA for cardiovascular risk prediction.
Figure 2. ROC curves for the low-risk group (left) and the high-risk group (right). Colours and line styles represent the prediction performance of the different models. Prediction performance for both the low- and high-risk groups was significantly better in model 5* than in the FRS.
Figure 3. The top 15 features of the MLA models, showing the relative importance of the different variables in CVD risk prediction. Age, glucose, LDL cholesterol, wake-period blood pressure variability, medication for BP and dyslipidemia, triglycerides and albumin were some of the predictors common to the different versions.
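The monitoring-derived predictors named in the abstract and in Figure 3 (average heart rate, awake blood pressure variability, percentage of time in diastolic hypertension) could be computed from 24-h ambulatory readings roughly as in the following Python sketch. It rests on several assumptions that the source does not confirm: variability is taken as the sample standard deviation of awake systolic readings, the share of readings at or above a threshold stands in for percentage of time (reasonable only if readings are evenly spaced), and the default `dbp_threshold` of 85 mmHg is a placeholder rather than the study's definition. The function and field names are hypothetical.

```python
from statistics import mean, stdev


def abpm_features(readings, dbp_threshold=85):
    """Summarise ambulatory readings into illustrative risk features.

    readings: list of (systolic, diastolic, heart_rate, awake) tuples,
    where awake is True for readings taken during the wake period.
    dbp_threshold: placeholder cut-off in mmHg (an assumption, not from the paper).
    """
    if not readings:
        raise ValueError("at least one reading is required")

    awake_sbp = [sbp for sbp, _, _, awake in readings if awake]
    if len(awake_sbp) < 2:
        raise ValueError("at least two awake readings are needed for variability")

    # Share of readings at or above the diastolic cut-off, as a percentage.
    n_high = sum(1 for _, dbp, _, _ in readings if dbp >= dbp_threshold)

    return {
        "avg_heart_rate": mean(hr for _, _, hr, _ in readings),
        "awake_sbp_sd": stdev(awake_sbp),  # variability as sample SD (assumed)
        "pct_time_dbp_high": 100.0 * n_high / len(readings),
    }
```

Each returned value would become one column in the feature table passed to the models, alongside the questionnaire and blood test variables.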
