Stat Med. 2015 Jun 15;34(13):2081-103. doi: 10.1002/sim.6471. Epub 2015 Mar 20.

Summarising and validating test accuracy results across multiple studies for use in clinical practice

Richard D Riley et al. Stat Med.

Abstract

Following a meta-analysis of test accuracy studies, the translation of summary results into clinical practice is potentially problematic. The sensitivity, specificity and positive (PPV) and negative (NPV) predictive values of a test may differ substantially from the average meta-analysis findings, because of heterogeneity. Clinicians thus need more guidance: given the meta-analysis, is a test likely to be useful in new populations, and if so, how should test results inform the probability of existing disease (for a diagnostic test) or future adverse outcome (for a prognostic test)? We propose ways to address this. Firstly, following a meta-analysis, we suggest deriving prediction intervals and probability statements about the potential accuracy of a test in a new population. Secondly, we suggest strategies on how clinicians should derive post-test probabilities (PPV and NPV) in a new population based on existing meta-analysis results and propose a cross-validation approach for examining and comparing their calibration performance. Application is made to two clinical examples. In the first example, the joint probability that both sensitivity and specificity will be >80% in a new population is just 0.19, because of a low sensitivity. However, the summary PPV of 0.97 is high and calibrates well in new populations, with a probability of 0.78 that the true PPV will be at least 0.95. In the second example, post-test probabilities calibrate better when tailored to the prevalence in the new population, with cross-validation revealing a probability of 0.97 that the observed NPV will be within 10% of the predicted NPV.
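The post-test probabilities discussed above follow from Bayes' theorem: given a test's sensitivity and specificity and the disease prevalence in the new population, PPV and NPV can be derived directly. As a minimal illustrative sketch (not the paper's meta-analysis models; the summary estimates used here are hypothetical):

```python
def post_test_probs(sens, spec, prev):
    """Post-test probabilities (PPV, NPV) from sensitivity, specificity
    and prevalence via Bayes' theorem.

    PPV = P(disease | positive test)
    NPV = P(no disease | negative test)
    """
    # P(positive) = true positives + false positives
    ppv = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    # P(negative) = true negatives + false negatives
    npv = (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

# Hypothetical summary accuracy (sens 0.80, spec 0.90) applied to a new
# population with prevalence 0.20 -- tailoring to the local prevalence,
# as the abstract recommends for option B.
ppv, npv = post_test_probs(0.80, 0.90, 0.20)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")  # PPV = 0.667, NPV = 0.947
```

This makes concrete why calibrating to the prevalence in the new population matters: the same summary sensitivity and specificity yield very different PPV and NPV as prevalence changes.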

Keywords: calibration; diagnostic; discrimination; meta-analysis; prognostic; test accuracy.


Figures

Figure 1
Confidence and prediction regions following application of model (1) to the temperature data.
Figure 2
Meta‐analysis of the observed/expected (O/E) calibration statistics (frequentist estimation of model (16)) from the internal–external cross‐validation approach applied to the ear temperature data for diagnosis of fever. PPV, positive predictive value; NPV, negative predictive value.
Figure 3
Meta‐analysis of the observed/expected (O/E) calibration statistics (frequentist estimation of model (16)) from the internal–external cross‐validation approach applied to the parathyroid data at 1–2 h for prediction of hypocalcaemia. PPV, positive predictive value; NPV, negative predictive value.
Figure 4
Calibration of predicted and observed post‐test probabilities, for (a) positive predictive value (PPV) derived using option A in the temperature example and (b) negative predictive value (NPV) derived using option B in the parathyroid example. Each circle represents a study and is proportional to the study sample size.
Figure 5
Posterior distributions for the true observed/expected (O/E), positive predictive value (PPV) or negative predictive value (NPV) in a new population, derived from Bayesian estimation of model (17).

