BMJ. 2008 May 17;336(7653):1106-10.
doi: 10.1136/bmj.39500.677199.AE.

Grading quality of evidence and strength of recommendations for diagnostic tests and strategies

Holger J Schünemann et al. BMJ.

Erratum in

  • BMJ. 2008 May 24;336(7654). doi: 10.1136/bmj.a139. Schünemann, A Holger J [corrected to Schünemann, Holger J]

Abstract

The GRADE system can be used to grade the quality of evidence and strength of recommendations for diagnostic tests or strategies. This article explains how patient-important outcomes are taken into account in this process.

Conflict of interest statement

Competing interests: The authors are members of the GRADE Working Group. The work with this group probably advanced the careers of some or all of the authors and group members. Authors listed in the byline have received travel reimbursement and honorariums for presentations that included a review of GRADE’s approach to grading the quality of evidence and strength of recommendations. GHG acts as a consultant to UpToDate; his work includes helping UpToDate in their use of GRADE. HJS is documents editor and methodologist for the American Thoracic Society; he supports the implementation of GRADE by this and other organisations worldwide. VMM supports the implementation of GRADE in several North American not for profit professional organisations.

Figures

Fig 1 Two generic ways in which a test or diagnostic strategy can be evaluated. On the left, patients are randomised to a new test or strategy or to an old test or strategy. Those with a positive test result (cases detected) are randomised (or were previously randomised) to receive the best available management (second step of randomisation for management not shown). Investigators evaluate and compare patient-important outcomes in all patients in both groups. On the right, patients receive both a new test and a reference test (old or comparator test or strategy). Investigators can then calculate the accuracy of the test compared with the reference test (first step). To make judgments about importance to patients of this information, patients with a positive test (or strategy) in either group are (or have been in previous studies) submitted to treatment or no treatment; investigators then evaluate and compare patient-important outcomes in all patients in both groups (second step)
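
The first step on the right of Fig 1, calculating the accuracy of a new test against a reference test, can be sketched as follows. This is a minimal illustration, not the authors' method: the `TwoByTwo` class, the function names, and the counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of a new test against a reference test (hypothetical helper)."""
    tp: int  # new test positive, reference positive
    fp: int  # new test positive, reference negative
    fn: int  # new test negative, reference positive
    tn: int  # new test negative, reference negative


def sensitivity(t: TwoByTwo) -> float:
    # Proportion of reference-positive patients that the new test detects.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Proportion of reference-negative patients that the new test clears.
    return t.tn / (t.tn + t.fp)


# Invented counts, for illustration only.
table = TwoByTwo(tp=90, fp=20, fn=10, tn=180)
print(f"sensitivity = {sensitivity(table):.2f}")
print(f"specificity = {specificity(table):.2f}")
```

Accuracy figures of this kind cover only the first step. As the caption notes, judging importance to patients still requires linking test results to management and to patient-important outcomes.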
Fig 2 Test and treatment thresholds. What clinicians expect of a good test is that results change the probability sufficiently to confirm or exclude a diagnosis. Tests, however, are altering only the probability of a disease of interest being present. If a test result moves the probability of the condition of interest to below the test threshold, this indicates that the condition is very unlikely, the downsides associated with any further testing and treatment for this condition outweigh any anticipated benefit, and no further testing or treatment for that condition should follow. If the test result increases the probability of disease to above the treatment threshold, this indicates that the condition is very likely, confirmatory testing that raises the probability of the condition further is unnecessary, and the anticipated benefits of treatment outweigh potential harms. If the pre-test probability is above the treatment threshold, further confirmatory testing that raises the probability further would not be helpful. If the pre-test probability is below the test threshold, further exclusionary testing would not be useful. When the probability is between the test and treatment thresholds, testing will be useful. Test results are of greatest value when they shift the probability across either threshold
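
The threshold logic in Fig 2 can be sketched in code. The example below assumes that a test result is summarised by a likelihood ratio and updates the pre-test probability through odds; that is a standard approach but is not stated in the caption. The function names, threshold values, and likelihood ratios are hypothetical.

```python
def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Update a probability with a likelihood ratio via odds (assumes 0 < pre_test < 1)."""
    if not 0.0 < pre_test < 1.0:
        raise ValueError("pre_test must lie strictly between 0 and 1")
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


def next_step(probability: float, test_threshold: float, treatment_threshold: float) -> str:
    """Classify a probability against the two thresholds described in Fig 2."""
    if probability < test_threshold:
        return "no further testing or treatment"
    if probability > treatment_threshold:
        return "treat without further confirmatory testing"
    return "further testing is useful"


# Hypothetical thresholds and likelihood ratios, chosen only for illustration.
TEST_THRESHOLD = 0.05
TREATMENT_THRESHOLD = 0.70

pre_test = 0.30
after_positive = post_test_probability(pre_test, likelihood_ratio=9.0)
after_negative = post_test_probability(pre_test, likelihood_ratio=0.1)

print(next_step(pre_test, TEST_THRESHOLD, TREATMENT_THRESHOLD))
print(round(after_positive, 3), next_step(after_positive, TEST_THRESHOLD, TREATMENT_THRESHOLD))
print(round(after_negative, 3), next_step(after_negative, TEST_THRESHOLD, TREATMENT_THRESHOLD))
```

With these invented numbers the pre-test probability sits between the thresholds, and either result moves it across one of them, which is the situation the caption describes as the one where test results are of greatest value.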
Fig 3 Example of heterogeneity in diagnostic test results. Sensitivity and specificity of multislice coronary computed tomography compared with coronary angiogram (from Hamon et al⁴). This heterogeneity also existed for likelihood ratios and diagnostic odds ratios
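
Likelihood ratios and diagnostic odds ratios are derived from sensitivity and specificity, so heterogeneity in one appears in the others. The sketch below uses the standard definitions. The two studies and their values are invented for illustration and are not taken from Hamon et al.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-); assumes both inputs lie strictly between 0 and 1."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Diagnostic odds ratio, computed as LR+ divided by LR-."""
    lr_positive, lr_negative = likelihood_ratios(sensitivity, specificity)
    return lr_positive / lr_negative


# Hypothetical (sensitivity, specificity) pairs from two imaginary studies.
hypothetical_studies = {
    "study A": (0.95, 0.90),
    "study B": (0.80, 0.75),
}

for name, (sens, spec) in hypothetical_studies.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    dor = diagnostic_odds_ratio(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

In this invented pair, a moderate difference in sensitivity and specificity produces a much larger difference in the diagnostic odds ratio.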

References

    1. Deeks JJ. Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ 2001;323:157-62.
    2. Bossuyt PM, Irwig L, Craig J, Glasziou P. Comparative accuracy: assessing new tests against existing diagnostic pathways. BMJ 2006;332:1089-92.
    3. Oxman AD, Guyatt GH. Guidelines for reading literature reviews. CMAJ 1988;138:697-703.
    4. Mulrow C, Linn WD, Gaul MK, Pugh JA. Assessing quality of a diagnostic test evaluation. J Gen Intern Med 1989;4:288-95.
    5. Guyatt G, Montori V, Devereaux PJ, Schünemann H, Bhandari M. Patients at the center: in our practice, and in our use of language. ACP J Club 2004;140(1):A11-2.