One statistical test is sufficient for assessing new predictive markers

Andrew J Vickers¹, Angel M Cronin, Colin B Begg

PMID: 21276237
PMCID: PMC3042425
DOI: 10.1186/1471-2288-11-13

BMC Med Res Methodol. 2011 Jan 28;11:13.

Affiliation

¹ Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, Box 44, New York, NY 10065 USA. vickersa@mskcc.org

Abstract

Background: We have observed that the area under the receiver operating characteristic curve (AUC) is increasingly being used to evaluate whether a novel predictor should be incorporated in a multivariable model to predict risk of disease. Frequently, investigators will approach the issue in two distinct stages: first, by testing whether the new predictor variable is significant in a multivariable regression model; second, by testing differences between the AUC of models with and without the predictor using the same data from which the predictive models were derived. These two steps often lead to discordant conclusions.

Discussion: We conducted a simulation study in which two predictors, X and X*, were generated as standard normal variables with varying levels of predictive strength, represented by means that differed depending on the binary outcome Y. The data sets were analyzed using logistic regression, and likelihood ratio and Wald tests for the incremental contribution of X* were performed. The patient-specific predictors for each of the models were then used as data for a test comparing the two AUCs. Under the null, the sizes of the likelihood ratio and Wald tests were close to nominal, but the area test was extremely conservative, with test sizes less than 0.006 for all configurations studied. Where X* was associated with outcome, the area test had much lower power than the likelihood ratio and Wald tests.

Summary: Evaluation of the statistical significance of a new predictor when there are existing clinical predictors is most appropriately accomplished in the context of a regression model. Although comparison of AUCs is conceptually equivalent to the likelihood ratio and Wald tests, it has vastly inferior statistical properties. Use of both approaches will frequently lead to inconsistent conclusions. Nonetheless, comparison of receiver operating characteristic curves remains a useful descriptive tool for initial evaluation of whether a new predictor might be of clinical relevance.
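
The comparison described in the abstract can be sketched for a single simulated data set as follows. This is a minimal illustration and not the authors' code. Several choices are assumptions made here: X and X* are drawn independently, the sample size, mean shifts, and seed are arbitrary, and the DeLong test for correlated AUCs stands in for the "area test", which the abstract does not name. The function names (`invert`, `fit_logistic`, `delong_test`, `compare_tests`) are hypothetical, and only the Python standard library is used.

```python
import math
import random
import statistics


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [row[n:] for row in m]


def linear(beta, row):
    """Linear predictor for one design-matrix row."""
    return sum(b * v for b, v in zip(beta, row))


def fit_logistic(rows, y, iters=25):
    """Newton-Raphson logistic fit; returns (beta, covariance, log-likelihood)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = [1 / (1 + math.exp(-linear(beta, r))) for r in rows]
        grad = [sum((yi - pi) * r[j] for r, yi, pi in zip(rows, y, p)) for j in range(k)]
        info = [[sum(pi * (1 - pi) * r[i] * r[j] for r, pi in zip(rows, p))
                 for j in range(k)] for i in range(k)]
        cov = invert(info)  # inverse Fisher information; stable once converged
        beta = [b + sum(cov[j][i] * grad[i] for i in range(k)) for j, b in enumerate(beta)]
    p = [1 / (1 + math.exp(-linear(beta, r))) for r in rows]
    loglik = sum(math.log(pi if yi else 1 - pi) for yi, pi in zip(y, p))
    return beta, cov, loglik


def placements(cases, controls):
    """Per-subject components of the Mann-Whitney AUC estimate."""
    psi = lambda a, b: 1.0 if a > b else 0.5 if a == b else 0.0
    v10 = [sum(psi(a, b) for b in controls) / len(controls) for a in cases]
    v01 = [sum(psi(a, b) for a in cases) / len(cases) for b in controls]
    return v10, v01


def delong_test(s1, s2, y):
    """DeLong test for two correlated AUCs; returns (auc1, auc2, p-value)."""
    comps = []
    for s in (s1, s2):
        cases = [v for v, yi in zip(s, y) if yi]
        controls = [v for v, yi in zip(s, y) if not yi]
        comps.append(placements(cases, controls))
    (a10, a01), (b10, b01) = comps
    auc1, auc2 = statistics.fmean(a10), statistics.fmean(b10)
    # Variance of the AUC difference from the paired placement differences.
    var = (statistics.variance([u - v for u, v in zip(a10, b10)]) / len(a10)
           + statistics.variance([u - v for u, v in zip(a01, b01)]) / len(a01))
    z = (auc1 - auc2) / math.sqrt(var)
    return auc1, auc2, math.erfc(abs(z) / math.sqrt(2))


def compare_tests(n=500, mu_x=0.3, mu_star=0.3, seed=1):
    """Simulate one data set and apply all three tests for X* (assumed settings)."""
    rng = random.Random(seed)
    y = [rng.random() < 0.5 for _ in range(n)]
    x = [rng.gauss(mu_x * yi, 1) for yi in y]         # existing predictor X
    xs = [rng.gauss(mu_star * yi, 1) for yi in y]     # new predictor X*
    base = [[1.0, a] for a in x]
    full = [[1.0, a, b] for a, b in zip(x, xs)]
    b0, _, ll0 = fit_logistic(base, y)
    b1, cov1, ll1 = fit_logistic(full, y)
    lr_stat = 2 * (ll1 - ll0)
    p_lr = math.erfc(math.sqrt(max(lr_stat, 0.0) / 2))        # chi-square, 1 df
    p_wald = math.erfc(abs(b1[2]) / math.sqrt(2 * cov1[2][2]))
    auc0, auc1, p_auc = delong_test([linear(b0, r) for r in base],
                                    [linear(b1, r) for r in full], y)
    return {"lr_stat": lr_stat, "p_lr": p_lr, "p_wald": p_wald, "p_auc": p_auc,
            "auc_base": auc0, "auc_full": auc1}
```

Calling `compare_tests()` returns the three p-values and both in-sample AUCs for one data set. Repeating the call over many seeds, with `mu_star=0` for the null case, and counting how often each p-value falls below 0.05 would give rough estimates of the size and power of each test under these assumed settings.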
