Understanding increments in model performance metrics

Michael J Pencina¹, Ralph B D'Agostino, Joseph M Massaro

Affiliation

¹ Department of Biostatistics, Harvard Clinical Research Institute, Boston University, CrossTown, 801 Massachusetts Ave., Boston, MA 02118, USA. mpencina@bu.edu

PMID: 23242535
PMCID: PMC3656609
DOI: 10.1007/s10985-012-9238-0

Lifetime Data Anal. 2013 Apr;19(2):202-18.

doi: 10.1007/s10985-012-9238-0. Epub 2012 Dec 16.

Abstract

The area under the receiver operating characteristic curve (AUC) is the most commonly reported measure of discrimination for prediction models with binary outcomes. Recently, however, it has been criticized for its inability to increase when important risk factors are added to a baseline model that already has good discrimination. This has led to the claim that reliance on the AUC as a measure of discrimination may miss important improvements in the clinical performance of risk prediction rules derived from a baseline model. In this paper we investigate this claim by relating the AUC to measures of clinical performance based on sensitivity and specificity under the assumption of multivariate normality. The behavior of the AUC is contrasted with that of the discrimination slope. We show that, unless rules with very good specificity are desired, the change in the AUC does an adequate job as a predictor of the change in measures of clinical performance. However, stronger or more numerous predictors are needed to achieve the same increment in the AUC for baseline models with good discrimination than for those with poor discrimination. When excellent specificity is desired, our results suggest that the discrimination slope might be a better measure of model improvement than the AUC. The theoretical results are illustrated using a Framingham Heart Study example of a model for predicting the 10-year incidence of atrial fibrillation.
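The link between the AUC and sensitivity at a fixed specificity can be illustrated with a minimal sketch. It is not the paper's own derivation: it assumes the textbook equal-variance normal model, in which the risk score is N(0, 1) among non-events and N(d, 1) among events, so that AUC = Φ(d/√2) and sensitivity at specificity s equals Φ(d − Φ⁻¹(s)). The function names, the effect size of 0.5, and the assumption that a new predictor is independent of the baseline score (so that squared distances add) are all hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def auc_from_distance(d):
    """AUC for event/non-event scores that are N(d, 1) and N(0, 1)."""
    return _STD.cdf(d / sqrt(2))


def distance_from_auc(auc):
    """Inverse of auc_from_distance: the standardized mean separation d."""
    return sqrt(2) * _STD.inv_cdf(auc)


def sensitivity_at_specificity(d, specificity):
    """Sensitivity of the rule whose threshold yields the given specificity."""
    return _STD.cdf(d - _STD.inv_cdf(specificity))


def add_predictor(auc_base, effect):
    """AUC after adding one predictor with standardized effect `effect`.

    Assumption (illustrative): the new predictor is independent of the
    baseline score, so squared distances add.
    """
    d_new = sqrt(distance_from_auc(auc_base) ** 2 + effect ** 2)
    return auc_from_distance(d_new)


# The same added predictor is applied to baselines of increasing strength.
for auc_base in (0.60, 0.70, 0.80, 0.90):
    auc_new = add_predictor(auc_base, effect=0.5)
    d_base, d_new = distance_from_auc(auc_base), distance_from_auc(auc_new)
    sens_gain = (sensitivity_at_specificity(d_new, 0.90)
                 - sensitivity_at_specificity(d_base, 0.90))
    print(f"baseline AUC {auc_base:.2f}: "
          f"AUC gain {auc_new - auc_base:.4f}, "
          f"sensitivity gain at 90% specificity {sens_gain:.4f}")
```

Under these assumptions the AUC gain from the same added predictor shrinks as the baseline AUC rises, which is consistent with the abstract's statement that stronger or more numerous predictors are needed to move the AUC of an already well-discriminating model.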

Figures

**Figure 1**
Sensitivity at constant Specificity as function of AUC

**Figure 2**
Sensitivity at constant Specificity as function of Discrimination Slope

**Figure 3**
Youden Index and Relative Utility as function of AUC

**Figure 4**
Youden Index and Relative Utility as function of Discrimination Slope

**Figure 5**
Youden Index and Relative Utility as function of AUC

**Figure 6**
Youden Index and Relative Utility as function of Discrimination Slope
