Review
Eur Heart J. 2017 Jun 14;38(23):1805-1814.
doi: 10.1093/eurheartj/ehw302.

Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges

Benjamin A Goldstein et al. Eur Heart J.

Abstract

Risk prediction plays an important role in clinical cardiology research. Traditionally, most risk models have been based on regression. While useful and robust, these statistical methods are limited to a small number of predictors, each of which operates in the same way on everyone and uniformly throughout its range. The purpose of this review is to illustrate the use of machine-learning methods for developing risk prediction models. Although typically presented as black-box approaches, most machine-learning methods are aimed at solving particular data-analysis challenges that typical regression approaches do not address well. To illustrate these challenges, and how different methods can address them, we consider predicting mortality after diagnosis of acute myocardial infarction. We use data derived from our institution's electronic health record and abstract data on 13 regularly measured laboratory markers. We walk through the challenges that arise in modelling these data and then introduce different machine-learning approaches. Finally, we discuss general issues in the application of machine-learning methods, including tuning parameters, loss functions, variable importance, and missing data. Overall, this review serves as an introduction to the diffuse field of machine learning for those working on risk modelling.

Keywords: Electronic health records; Personalized medicine; Precision medicine; Risk prediction.

Figures

Figure 1
One perspective on the intersection of statistical modelling (blue) and machine-learning (green) goals. The figure highlights that while the processes differ, the overarching goals are often the same.
Figure 2
Observed non-linearities in the predicted probabilities of death for calcium (A) and haemoglobin (B) on the logistic (i.e. linear) and probability (C/D) scales. Both show a sharp decreasing relationship with death at lower values, which levels off at more moderate values. Models were estimated using cubic splines and were adjusted for age, sex, and race. 95% confidence bands are shown.
Figure 3
Classification and regression trees (A) for predicting acute myocardial infarction. The first split is on minimum CO2. Splits on different sides of the tree (creatinine on left, sodium on right) indicate potential interactions. Interaction plot (B) between CO2 and creatinine highlights the differential relationship.
Figure 4
High-level overview of the process of applying machine-learning routines to data.
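
The Figure 2 caption distinguishes the logistic (log-odds) scale from the probability scale. As a rough illustration of why the same relationship looks different on the two scales, the following minimal sketch converts evenly spaced log-odds values to probabilities. The function names and the grid of values are hypothetical and are not taken from the article or its data.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Map a value on the logistic (log-odds) scale to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def logit(p: float) -> float:
    """Map a probability strictly between 0 and 1 to the log-odds scale."""
    return math.log(p / (1.0 - p))


# Hypothetical, evenly spaced values on the log-odds scale (not real patient data).
log_odds_grid = [-4.0, -3.0, -2.0, -1.0, 0.0]

# The same points expressed as probabilities.
probs = [inv_logit(z) for z in log_odds_grid]

# Differences between consecutive probabilities: equal steps in log-odds
# do not produce equal steps in probability.
steps = [b - a for a, b in zip(probs, probs[1:])]

for z, p in zip(log_odds_grid, probs):
    print(f"log-odds {z:+.1f} -> probability {p:.3f}")
```

A relationship that is a straight line on the log-odds scale is therefore curved on the probability scale, which may be one reason to inspect fitted relationships on both scales.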
