BMJ. 2024 Jan 15;384:e074820.
doi: 10.1136/bmj-2023-074820.

Evaluation of clinical prediction models (part 2): how to undertake an external validation study

Richard D Riley et al. BMJ. 2024.

Abstract

External validation studies are an important but often neglected part of prediction model research. In this article, the second in a series on model evaluation, Riley and colleagues explain what an external validation study entails and describe the key steps involved, from establishing a high quality dataset to evaluating a model’s predictive performance and clinical usefulness.

Conflict of interest statement

Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare: funding from the EPSRC, NIHR-MRC, NIHR Birmingham Biomedical Research Centre, and Cancer Research UK for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Figures

Fig 1
Application of an existing prediction model to derive predicted values for each participant in the external validation dataset
Fig 2
Example of a binary outcome prediction model to be externally validated in new data. SD=standard deviation; IQR=interquartile range
Fig 3
Example of a time-to-event outcome prediction model to be externally validated in new data. SD=standard deviation; IQR=interquartile range
Fig 4
Calibration plots for binary outcome prediction model on external validation. Example shows probability of 30 day mortality after an acute myocardial infarction. Area below the dashed line=where the model’s risk estimates are too high; area above the dashed line=where the model’s risk estimates are too low; 10 circles=10 groups defined by tenths of the distribution of estimated risks; histograms at the bottom of graphs show the distribution of risk estimates for each outcome group
Fig 5
Calibration plots for time-to-event prediction model on external validation. Example shows time-to-event outcome: probability of five year recurrence after a diagnosis of primary breast cancer. Area below the dashed line=where the model’s risk estimates are too high; area above the dashed line=where the model’s risk estimates are too low; 10 circles=10 groups defined by tenths of the distribution of estimated risks; histograms at the bottom of graphs show the distribution of risk estimates for each outcome group
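
The grouped points described in the calibration plot captions can be illustrated with a minimal sketch for a binary outcome. It sorts participants by estimated risk, splits them into equal-sized groups (tenths by default), and returns the mean estimated risk and observed outcome proportion in each group. The function name, the toy data, and the simple equal-count grouping are assumptions made for illustration and are not taken from the article. For a time-to-event outcome, the observed risk in each group would also need to account for censoring (for example with a Kaplan-Meier estimate), which this sketch does not attempt.

```python
def calibration_groups(risks, outcomes, n_groups=10):
    """Return (mean estimated risk, observed proportion) for each risk group.

    risks:    estimated probabilities from the model, one per participant
    outcomes: observed binary outcomes (1 = event, 0 = no event)
    n_groups: number of equal-sized groups (10 gives tenths of the risk distribution)
    """
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")

    # Sort participants by estimated risk so that groups are contiguous slices.
    pairs = sorted(zip(risks, outcomes))
    n = len(pairs)

    groups = []
    for g in range(n_groups):
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        chunk = pairs[start:end]
        if not chunk:
            continue  # fewer participants than groups
        mean_risk = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        groups.append((mean_risk, observed))
    return groups


# Hypothetical toy data: eight participants split into four groups.
toy_risks = [0.1, 0.1, 0.3, 0.3, 0.6, 0.6, 0.9, 0.9]
toy_outcomes = [0, 0, 0, 1, 1, 0, 1, 1]
toy_groups = calibration_groups(toy_risks, toy_outcomes, n_groups=4)
```

Plotting observed proportion against mean estimated risk for each group gives the circles in such a plot. Points below the diagonal are groups where the estimated risks are too high, and points above it are groups where they are too low.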
Fig 6
Decision curves showing net benefit for binary outcome prediction model across a range of threshold probabilities that define when some clinical action (eg, treatment) is warranted. Example shows probability of 30 day mortality after an acute myocardial infarction. Threshold probability=risk needed to initiate a particular treatment or clinical action; positive values of net benefit indicate clinical utility; treat all=strategy of initiating the particular treatment (or clinical action) for all patients regardless of their estimated risk; treat none=strategy of not initiating the treatment (or clinical action) for any patient; treat per model=strategy of initiating the treatment (or clinical action) for those patients whose estimated risk is at or above the threshold probability. An interactive version of this graphic is available at: https://public.flourish.studio/visualisation/15175981/
Fig 7
Decision curves showing net benefit for time-to-event prediction model across a range of threshold probabilities that define when some clinical action (eg, treatment) is warranted. Example shows probability of five year recurrence after a diagnosis of primary breast cancer. Threshold probability=risk needed to initiate a particular treatment or clinical action; positive values of net benefit indicate clinical utility; treat all=strategy of initiating the particular treatment (or clinical action) for all patients regardless of their estimated risk; treat none=strategy of not initiating the treatment (or clinical action) for any patient; treat per model=strategy of initiating the treatment (or clinical action) for those patients whose estimated risk is at or above the threshold probability. An interactive version of this graphic is available at: https://public.flourish.studio/visualisation/15162451/
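
The three strategies named in the decision curve captions can be compared with a short sketch for a binary outcome. It uses the standard net benefit definition from decision curve analysis: the proportion of true positives minus the proportion of false positives weighted by the odds of the threshold probability. The function names and toy data are assumptions made for illustration and are not taken from the article. A time-to-event outcome would need censoring-aware estimates of the true and false positive proportions, which this sketch does not cover.

```python
def net_benefit_model(risks, outcomes, threshold):
    """Net benefit of 'treat per model' at a single threshold probability.

    Participants whose estimated risk is at or above the threshold are treated.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(risks)
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # odds of the threshold probability
    return true_pos / n - weight * false_pos / n


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of treating everyone regardless of estimated risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    events = sum(outcomes)
    weight = threshold / (1 - threshold)
    return events / n - weight * (n - events) / n


# 'Treat none' has a net benefit of zero at every threshold by definition.
NET_BENEFIT_TREAT_NONE = 0.0

# Hypothetical toy data: six participants, two of whom have the outcome.
toy_risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
toy_outcomes = [0, 0, 0, 1, 0, 1]
nb_model = net_benefit_model(toy_risks, toy_outcomes, threshold=0.2)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.2)
```

Repeating the calculation over a range of thresholds and plotting each strategy gives a decision curve. In this toy example the model-based strategy has the higher net benefit at a threshold of 0.2, although with so few participants the numbers carry no practical meaning.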
