R²: a useful measure of model performance when predicting a dichotomous outcome
- PMID: 10070680
- DOI: 10.1002/(sici)1097-0258(19990228)18:4<375::aid-sim20>3.0.co;2-j
Abstract
R² has been criticized as a measure of model performance when predicting a dichotomous outcome, both because its value is often low and because it is sensitive to the prevalence of the event of interest. The C statistic is more widely used to measure model performance in a 0/1 setting. We use a simple parametric family of models to illustrate the potential usefulness of models with low R² values, to clarify the effect of prevalence on both C and R², and to demonstrate how R² captures information not picked up by C. We also show that C is subject to a 'random mixing' problem that does not affect R². Finally, we report both R² and C values for different risk-adjustment models in situations with different prevalences and show the relationship between the measures and decile death rates, thereby providing a context for interpreting R² values in a 0/1 setting.
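
As a rough illustration of how the two measures can disagree, the following minimal Python sketch computes R-squared and C for a 0/1 outcome. It assumes the conventional definitions: R-squared as one minus the ratio of the squared prediction error to the total sum of squares around the observed event rate, and C as the fraction of event/non-event pairs in which the event receives the higher predicted probability, with ties counted as one half. The abstract does not state the exact formulas used in the paper, and the function names and toy data below are hypothetical.

```python
from itertools import product


def r_squared(y, p):
    """R-squared for 0/1 outcomes y and predicted probabilities p (assumed definition: 1 - SSE/SST)."""
    mean_y = sum(y) / len(y)  # observed prevalence
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def c_statistic(y, p):
    """C statistic: share of (event, non-event) pairs ranked correctly, ties counted as 0.5."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pe, pn in product(events, nonevents):
        if pe > pn:
            score += 1.0
        elif pe == pn:
            score += 0.5
    return score / (len(events) * len(nonevents))


# Hypothetical toy data: one event among four cases (prevalence 0.25).
y = [0, 0, 0, 1]
p_a = [0.10, 0.20, 0.30, 0.40]  # ranks cases correctly, predictions close together
p_b = [0.01, 0.02, 0.03, 0.94]  # same ranking, predictions far more separated

c_a, c_b = c_statistic(y, p_a), c_statistic(y, p_b)
r2_a, r2_b = r_squared(y, p_a), r_squared(y, p_b)
print(c_a, c_b)
print(r2_a, r2_b)
```

Because C depends only on the ordering of the predictions, both hypothetical models receive the same C, while R-squared also reflects how far the predicted probabilities for events and non-events are pulled apart. This is one simple way the measures can diverge and is not necessarily the illustration used in the paper.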