Review
Clin Cancer Res. 2008 Oct 1;14(19):5977-83.
doi: 10.1158/1078-0432.CCR-07-4534.

Validation of biomarker-based risk prediction models

Jeremy M G Taylor et al. Clin Cancer Res.

Abstract

The increasing availability and use of predictive models to facilitate informed decision making highlights the need for careful assessment of the validity of these models. In particular, models involving biomarkers require careful validation for two reasons: issues with overfitting when complex models involve a large number of biomarkers, and interlaboratory variation in assays used to measure biomarkers. In this article, we distinguish between internal and external statistical validation. Internal validation, involving training-testing splits of the available data or cross-validation, is a necessary component of the model building process and can provide valid assessments of model performance. External validation consists of assessing model performance on one or more data sets collected by different investigators from different institutions. External validation is a more rigorous procedure necessary for evaluating whether the predictive model will generalize to populations other than the one on which it was developed. We stress the need for an external data set to be truly external, that is, to play no role in model development and ideally be completely unavailable to the researchers building the model. In addition to reviewing different types of validation, we describe different types and features of predictive models and strategies for model building, as well as measures appropriate for assessing their performance in the context of validation. No single measure can characterize the different components of the prediction, and the use of multiple summary measures is recommended.
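
The internal validation described above (cross-validation, followed by more than one summary measure) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the simulated single-biomarker data, the two-bin risk model, and the function names (`simulate`, `fit_two_bin_model`, `cross_validate`) are all hypothetical, and the AUC and Brier score are just two common choices of summary measure. Every prediction is made by a model that never saw that subject during fitting.

```python
import math
import random


def auc(risks, outcomes):
    """Chance that a random case has a higher predicted risk than a random
    non-case; ties count as one half. Measures discrimination only."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def brier_score(risks, outcomes):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((r - y) ** 2 for r, y in zip(risks, outcomes)) / len(risks)


def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def fit_two_bin_model(markers, outcomes):
    """Hypothetical toy model: event rate above and below the training median."""
    cut = sorted(markers)[len(markers) // 2]
    overall = sum(outcomes) / len(outcomes)
    high = [y for x, y in zip(markers, outcomes) if x >= cut]
    low = [y for x, y in zip(markers, outcomes) if x < cut]
    risk_high = sum(high) / len(high) if high else overall
    risk_low = sum(low) / len(low) if low else overall
    return lambda x: risk_high if x >= cut else risk_low


def cross_validate(markers, outcomes, k=5, seed=0):
    """Pool out-of-fold predicted risks, then report two summary measures."""
    risks = [None] * len(markers)
    for train, test in k_fold_indices(len(markers), k, seed):
        model = fit_two_bin_model([markers[i] for i in train],
                                  [outcomes[i] for i in train])
        for i in test:
            risks[i] = model(markers[i])
    return auc(risks, outcomes), brier_score(risks, outcomes)


def simulate(n, seed=1):
    """Simulated data (an assumption): one biomarker with a logistic effect."""
    rng = random.Random(seed)
    markers = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outcomes = [1 if rng.random() < 1.0 / (1.0 + math.exp(-1.5 * x)) else 0
                for x in markers]
    return markers, outcomes


if __name__ == "__main__":
    markers, outcomes = simulate(200)
    cv_auc, cv_brier = cross_validate(markers, outcomes, k=5)
    print(f"cross-validated AUC:   {cv_auc:.3f}")
    print(f"cross-validated Brier: {cv_brier:.3f}")
```

Cross-validation of this kind remains internal validation. An external validation would instead fit the model once, freeze it, and apply it unchanged to data collected by different investigators at a different institution.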

Figures

Figure 1
ROC curves for PCPT risk calculator and PSA applied to the San Antonio cohort. Adapted from Urology, Vol. 68(6), Dipen J. Parekh, Donna Pauler Ankerst, Betsy A. Higgins, Javier Hernandez, Edith Canby-Hagino, Timothy Brand, Dean A. Troyer, Robin J. Leach and Ian M. Thompson, External validation of the Prostate Cancer Prevention Trial risk calculator in a screened population, 1152–5, Copyright (2006), with permission from Elsevier.
