Review
Radiology. 2023 May;307(3):e221437.
doi: 10.1148/radiol.221437. Epub 2023 Mar 14.

How to Critically Appraise and Interpret Systematic Reviews and Meta-Analyses of Diagnostic Accuracy: A User Guide

Robert A Frank et al. Radiology. 2023 May.

Abstract

Systematic reviews of diagnostic accuracy studies can provide the best available evidence to inform decisions regarding the use of a diagnostic test. In this guide, the authors provide a practical approach for clinicians to appraise diagnostic accuracy systematic reviews and apply their results to patient care. The first step is to identify an appropriate systematic review with a research question matching the clinical scenario. The user should evaluate the rigor of the review methods to evaluate its credibility (Did the review use clearly defined eligibility criteria, a comprehensive search strategy, structured data collection, risk of bias and applicability appraisal, and appropriate meta-analysis methods?). If the review is credible, the next step is to decide whether the diagnostic performance is adequate for clinical use (Do sensitivity and specificity estimates exceed the threshold that makes them useful in clinical practice? Are these estimates sufficiently precise? Is variability in the estimates of diagnostic accuracy across studies explained?). Diagnostic accuracy systematic reviews that are judged to be credible and provide diagnostic accuracy estimates with sufficient certainty and relevance are the most useful to inform patient care. This review discusses comparative, noncomparative, and emerging approaches to systematic reviews of diagnostic accuracy using a clinical scenario and examples based on recent publications.


Conflict of interest statement

Disclosures of conflicts of interest: R.A.F. No relevant relationships. J.P.S. No relevant relationships. N.I. No relevant relationships. B.Y. No relevant relationships. M.H.M. No relevant relationships. R.M. No relevant relationships. M.L. No relevant relationships. P.M.B. Consultant for Radiology. Y.T. No relevant relationships. P.W. No relevant relationships. H.D. No relevant relationships. S.K.K. Grants from NIH/NCI, NIH/MIDCR, ACRQ, and Doris Duke Foundation; royalties from Wolters Kluwer; honoraria for editorial board work from ARRS and teaching from RSNA; Chair of American College of Radiology Steering Committee on Incidental Findings and American College of Radiology Expert Panel on Gynecological and Obstetrical Imaging. S.E. No relevant relationships. B.L. No relevant relationships. B.H. Patents planned, issued, or pending from Ottawa Hospital Research Institute. M.D.F.M. CIHR operating grant to institution; associate editor for Radiology.

Figures

Graphical abstract
Figure 1: Example of presentation of Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) result. Chart shows risk of bias and applicability summary with authors’ judgments about each domain for each included study. Reprinted, with permission, from reference 84.
Figure 2: Example of a coupled forest plot of sensitivity and specificity. Published data from Duke et al (24) were re-analyzed by fitting a bivariate meta-analysis using the lme4 package in R (version 4.2.2; The R Foundation for Statistical Computing). This figure was generated using RevMan software (Cochrane; https://training.cochrane.org/online-learning/core-software/revman). Each study is identified by name of first author and year of publication, with blue squares representing individual study point estimates and horizontal lines indicating 95% CIs. For each primary study, values of sensitivity and specificity with associated upper and lower limits of the 95% CIs are provided, in addition to 2 × 2 data. Summary estimates of sensitivity and specificity are 0.965 (95% CI: 0.948, 0.977) and 0.967 (95% CI: 0.949, 0.979), respectively (not displayed in this forest plot). FN = false-negative, FP = false-positive, TN = true-negative, TP = true-positive.
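The per-study estimates in such a forest plot come directly from each study's 2 × 2 table. As a minimal sketch of that step (the function names and the example counts below are hypothetical, and the Wilson score interval is one common CI choice, not necessarily the one the authors used), sensitivity, specificity, and 95% CIs can be computed as:

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score 95% CI for a binomial proportion."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, centre - half), min(1.0, centre + half))

def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity (with 95% CIs) from a 2 x 2 table."""
    sens = tp / (tp + fn)   # true-positive rate among diseased
    spec = tn / (tn + fp)   # true-negative rate among non-diseased
    return {
        "sensitivity": (sens, wilson_ci(tp, tp + fn)),
        "specificity": (spec, wilson_ci(tn, tn + fp)),
    }

# Hypothetical counts for one study, for illustration only
result = accuracy_from_2x2(tp=90, fp=5, fn=10, tn=95)
```

Each study's point estimate and interval plotted in the forest plot corresponds to one such computation; the summary estimates, by contrast, come from the bivariate meta-analysis, not from pooling the 2 × 2 tables directly.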
Figure 3: Examples of summary receiver operating characteristic (ROC) plots. Published data from Duke et al (24) were re-analyzed by fitting a bivariate meta-analysis using the lme4 package in R (The R Foundation for Statistical Computing) (3) to obtain (A) a summary point and (B) a summary curve. Figures were generated using RevMan software. The open circles represent accuracy estimates for included primary studies, with the position relative to the y-axis representing the sensitivity and the position relative to the x-axis representing the false-positive rate (1 − specificity). Individual study points are weighted using sample size. In A, the position of the solid black circle in ROC space represents the summary estimate of sensitivity and specificity (sensitivity = 0.965 [95% CI: 0.948, 0.977]; specificity = 0.967 [95% CI: 0.949, 0.979]). The dashed circle represents the 95% prediction region, and the dotted line represents the 95% confidence region. In B, the solid line represents the hierarchical summary ROC curve. The dashed line denotes an uninformative ROC curve. Points along this curve correspond to where sensitivity = 1 − specificity (ie, the true- and false-positive rates are equal).
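The summary point in such plots comes from a bivariate random-effects model that jointly pools logit sensitivity and logit specificity across studies. Purely as intuition for the pooling step, the sketch below does univariate fixed-effect pooling on the logit scale with a continuity correction; this is a deliberate simplification, not the bivariate model and not the authors' code, and all names and counts are hypothetical:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def pooled_proportion(events, totals):
    """Fixed-effect inverse-variance pooling on the logit scale.
    Simplification only: diagnostic accuracy meta-analyses use a
    bivariate random-effects model (e.g. glmer from lme4 in R)."""
    weights, estimates = [], []
    for e, n in zip(events, totals):
        p = (e + 0.5) / (n + 1)                     # continuity correction
        var = 1 / (e + 0.5) + 1 / (n - e + 0.5)     # approx. var of logit(p)
        weights.append(1 / var)
        estimates.append(logit(p))
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    return inv_logit(pooled)

# Hypothetical true-positive counts and diseased totals from three studies
summary_sens = pooled_proportion([45, 88, 130], [50, 95, 140])
```

The real bivariate model additionally estimates the between-study variance and the correlation between logit sensitivity and logit specificity, which is what produces the confidence and prediction regions around the summary point.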
Figure 4: Example of summary receiver operating characteristic (ROC) plot with studies demonstrating higher between-study variability and lower accuracy estimates compared with those in Figure 3. The summary point is indicated by the black circle. Individual studies are indicated by open circles (scale = study sample size). The dotted border and the dashed border represent 95% confidence regions and 95% prediction regions, respectively. The dashed diagonal line denotes an uninformative ROC curve. Points along this curve correspond to where sensitivity = 1 − specificity (ie, the true- and false-positive rates are equal). Reprinted, under a CC BY license, from reference 38.
Figure 5: Example of summary receiver operating characteristic (ROC) plot with an intermediate degree of between-study variability compared with the previously presented examples. Hierarchical summary ROC (HSROC) curve for fracture-detection algorithms. The 95% prediction region is a visual representation of between-study heterogeneity. Reprinted, with permission, from reference 41.
Figure 6: Linked summary receiver operating characteristic (ROC) plot displays direct comparisons of two index tests. Plot is from the study by Chan et al (51), who compared the accuracy of chest US (CUS) and chest radiography (CXR) in the diagnosis of pneumothorax in trauma patients in the emergency department. Solid circles (summary points) represent the summary estimates of sensitivity and specificity for chest US (black circle) and chest radiography (red circle). Each summary point is surrounded by a dotted line of the same color representing the 95% confidence region and a dashed line of the same color representing the 95% prediction region. The dashed diagonal line denotes an uninformative ROC curve. Points along this curve correspond to where sensitivity = 1 − specificity (ie, the true- and false-positive rates are equal). Reprinted, under a CC BY license, from reference 51.

References

    1. Salameh JP, Bossuyt PM, McGrath TA, et al. Preferred reporting items for systematic review and meta-analysis of diagnostic test accuracy studies (PRISMA-DTA): explanation, elaboration, and checklist. BMJ 2020;370:m2632.
    2. Cohen JF, Deeks JJ, Hooft L, et al. Preferred reporting items for journal and conference abstracts of systematic reviews and meta-analyses of diagnostic test accuracy studies (PRISMA-DTA for Abstracts): checklist, explanation, and elaboration. BMJ 2021;372:n265.
    3. Frank RA, McInnes MDF, Levine D, et al. Are Study and Journal Characteristics Reliable Indicators of “Truth” in Imaging Research? Radiology 2018;287(1):215–223.
    4. Higgins JPT, Thomas J, Chandler J, et al., eds. Cochrane Handbook for Systematic Reviews of Interventions version 6.0 (updated July 2019). Cochrane, 2019. https://training.cochrane.org/handbook.
    5. Patsopoulos NA, Analatos AA, Ioannidis JP. Relative citation impact of various study designs in the health sciences. JAMA 2005;293(19):2362–2366.
