Regression trees for predicting mortality in patients with cardiovascular disease: what improvement is achieved by using ensemble-based methods?

Peter C Austin¹, Douglas S Lee, Ewout W Steyerberg, Jack V Tu

PMID: 22777999
PMCID: PMC3470596
DOI: 10.1002/bimj.201100251


Biom J. 2012 Sep;54(5):657-73. doi: 10.1002/bimj.201100251. Epub 2012 Jul 6.

Affiliation

¹ Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada. peter.austin@ices.on.ca

Abstract

In biomedical research, logistic regression is the most commonly used method for predicting the probability of a binary outcome. Although many clinical researchers have expressed enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999-2001 and 2004-2005). We found that ensemble methods offered substantial improvement over conventional regression trees in both in-sample and out-of-sample prediction of cardiovascular mortality. However, conventional logistic regression models that incorporated restricted cubic smoothing splines performed even better. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease.
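As a rough illustration of the bagging idea named in the abstract, the sketch below fits a one-split regression tree (a "stump") to a binary outcome, then averages the predictions of many stumps fitted to bootstrap resamples. It is not the authors' implementation: the data are synthetic, there is a single age-like predictor, all function names (`fit_stump`, `fit_bagged_stumps`, `simulate`, `demo`) are hypothetical, and the Brier score is used only as a convenient accuracy measure, since the abstract does not say which measures were reported.

```python
import math
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one predictor and a 0/1 outcome.

    Returns (threshold, left_mean, right_mean); x <= threshold goes left.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    # Fallback when every x is identical: predict the overall event rate.
    best = (float("inf"), pairs[0][0], total / n, total / n)
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split point between tied values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # For a 0/1 outcome the within-node squared error is n * p * (1 - p).
        sse = i * left_mean * (1 - left_mean) + (n - i) * right_mean * (1 - right_mean)
        if sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_trees, rng):
    """Bagging: fit one stump per bootstrap resample (drawn with replacement)."""
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """Average the predicted probabilities of all bootstrap stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


def brier_score(preds, ys):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def simulate(n, rng):
    """Synthetic cohort (assumed, not real data): risk of death rises with age."""
    xs = [rng.uniform(40, 90) for _ in range(n)]
    ys = [1 if rng.random() < 1 / (1 + math.exp(-(0.08 * x - 7))) else 0 for x in xs]
    return xs, ys


def demo(seed=0):
    """Compare out-of-sample Brier scores of one stump and of bagged stumps."""
    rng = random.Random(seed)
    train_x, train_y = simulate(2000, rng)
    test_x, test_y = simulate(2000, rng)
    stump = fit_stump(train_x, train_y)
    stumps = fit_bagged_stumps(train_x, train_y, n_trees=50, rng=rng)
    single = brier_score([predict_stump(stump, x) for x in test_x], test_y)
    bagged = brier_score([predict_bagged(stumps, x) for x in test_x], test_y)
    return single, bagged


if __name__ == "__main__":
    single, bagged = demo()
    print(f"out-of-sample Brier score, single stump: {single:.4f}")
    print(f"out-of-sample Brier score, bagged stumps: {bagged:.4f}")
```

Averaging over resamples turns the single hard step of one stump into a smoother risk curve, which is the mechanism by which bagging can improve on a single tree. Random forests and boosting, also named in the abstract, modify this scheme and are not shown here.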

Figures

**Figure 1**
Calibration plot in EFFECT2 AMI cohort.

**Figure 2**
Relationship between key continuous variables and log-odds of death.

**Figure 3**
Distribution of predicted probabilities of death in AMI sample.

**Figure 4**
Calibration plot in EFFECT2 CHF cohort.
