Review
Eur J Clin Microbiol Infect Dis. 2012 Sep;31(9):2111-6.
doi: 10.1007/s10096-012-1602-1. Epub 2012 Mar 29.

Methods and recommendations for evaluating and reporting a new diagnostic test

A S Hess et al. Eur J Clin Microbiol Infect Dis. 2012 Sep.

Abstract

No standardized guidelines exist for the biostatistical methods appropriate for studies evaluating diagnostic tests. Publication recommendations such as the STARD statement provide guidance for the analysis of data, but biostatistical advice is minimal and application is inconsistent. This article aims to provide a self-contained, accessible resource on the biostatistical aspects of study design and reporting for investigators. For all dichotomous diagnostic tests, estimates of sensitivity and specificity should be reported with confidence intervals. Power calculations are strongly recommended to ensure that investigators achieve desired levels of precision. In the absence of a gold standard reference test, the composite reference standard method is recommended for improving estimates of the sensitivity and specificity of the test under evaluation.

Conflict of interest statement

JKJ has received funding from Becton Dickinson. The other authors declare that they have no conflict of interest.

Figures

Fig. 1
a A 2×2 paired contingency table for comparing the results of two tests on the same samples. b Results of a comparison between a new assay (“Test A”) and a gold standard assay
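As a minimal sketch of how the 2×2 table in Fig. 1 yields the two headline estimates, the snippet below computes sensitivity and specificity from the four cell counts. The function name and the counts in the usage line are hypothetical and are not taken from the article's figure.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Estimate sensitivity and specificity from a 2x2 table.

    tp: new test positive, gold standard positive
    fp: new test positive, gold standard negative
    fn: new test negative, gold standard positive
    tn: new test negative, gold standard negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one gold-standard positive and one negative")
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives correctly cleared
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=45, fp=10, fn=5, tn=90)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```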
Fig. 2
95 % confidence intervals around estimates of sensitivity. Both tests have a sensitivity of 60 %. Test one (upper) has a 95 % confidence interval of (52 %, 68 %). Test two (lower) has a 95 % confidence interval of (40 %, 80 %). The 95 % confidence interval around the estimate of the sensitivity for test one is narrower than that for test two; therefore, the estimate for test one is more precise
Fig. 3
Formula for 95 % confidence intervals for sensitivity or specificity. p̂ is the estimate of sensitivity or specificity, n is either the number of true-positive samples (for sensitivity) or the number of true-negative samples (for specificity). This formula is appropriate as long as both np̂ and n(1−p̂) are not less than 5
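The formula in Fig. 3 is not reproduced in this record. The sketch below assumes it is the usual normal-approximation (Wald) interval, p̂ ± 1.96·sqrt(p̂(1−p̂)/n), which is consistent with the validity rule stated in the caption. The function name and the sample sizes in the usage lines are hypothetical; they were chosen so that the half-widths come out close to the 8 and 20 percentage points described in Fig. 2.

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    p_hat: estimated sensitivity or specificity (between 0 and 1)
    n:     number of reference-positive samples (for sensitivity)
           or reference-negative samples (for specificity)
    z:     1.96 gives a 95 % interval
    """
    # Rule of thumb from the caption: both n*p_hat and n*(1 - p_hat) must be >= 5.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("normal approximation not appropriate for these values")
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Same estimate (60 %), two assumed sample sizes: the larger n gives the narrower interval.
print(wald_interval(0.60, 144))
print(wald_interval(0.60, 23))
```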
Fig. 4
Three-step method to approximate the sample size n* with 90 % power to estimate p with a margin of error no more than M. Step 1 calculates a preliminary estimate n based on p̂, the estimated sensitivity or specificity, and M. Step 2 gives ‘power’ to the sample size estimate by calculating p*, the 90 % lower bound around p̂ given n. Step 3 calculates n* using the same equation as step 1, but substituting p* for p̂
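The three steps described in the Fig. 4 caption can be sketched as follows. The exact equations are not reproduced in this record, so the code assumes the standard normal-approximation forms: n = z²·p̂(1−p̂)/M² for steps 1 and 3, and a one-sided 90 % lower bound p* = p̂ − z₀.₉₀·sqrt(p̂(1−p̂)/n) for step 2. It also assumes the anticipated p̂ is above 0.5, so that lowering it increases the required sample size. All names and the inputs in the usage line are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_with_power(p_hat, margin, confidence=0.95, power=0.90):
    """Approximate sample size for estimating a proportion within +/- margin.

    Returns (n_prelim, p_star, n_star) for the three steps.
    Assumes normal-approximation formulas and p_hat > 0.5.
    """
    z_ci = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95 %
    z_power = NormalDist().inv_cdf(power)                   # about 1.28 for 90 %

    # Step 1: preliminary sample size from the anticipated estimate.
    n_prelim = math.ceil(z_ci**2 * p_hat * (1 - p_hat) / margin**2)

    # Step 2: one-sided lower bound on p_hat given the preliminary n.
    p_star = p_hat - z_power * math.sqrt(p_hat * (1 - p_hat) / n_prelim)

    # Step 3: repeat step 1 with the more conservative p_star.
    n_star = math.ceil(z_ci**2 * p_star * (1 - p_star) / margin**2)
    return n_prelim, p_star, n_star


# Hypothetical planning inputs: anticipated sensitivity 90 %, margin of error 5 points.
n_prelim, p_star, n_star = sample_size_with_power(p_hat=0.90, margin=0.05)
print(n_prelim, round(p_star, 3), n_star)
```

Here n counts reference-positive samples when planning for sensitivity and reference-negative samples when planning for specificity, so the total number of samples to enroll also depends on the expected prevalence.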
Fig. 5
a Summary of the two stages of a composite reference standard (CRS) test of a new test (N). Samples labeled negative by the imperfect standard (S) are re-tested with the third test, the imperfect ‘resolver’ (R). b Example showing the two stages of a CRS resolution of the new test, “Test A”
Fig. 6
a Formulas for calculating sensitivity and specificity using a composite reference standard method. b Formulas for calculating 95 % confidence intervals around composite reference standard (CRS) estimates of sensitivity and specificity. See Fig. 5a for reference
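The formulas of Fig. 6 are not reproduced in this record. As a rough sketch of the two-stage design described in Fig. 5a, the code below assumes the common CRS rule: a sample counts as reference-positive if the imperfect standard (S) is positive, or if S is negative and the resolver (R) is positive; it counts as reference-negative only if both S and R are negative. The function name, the tuple layout, and the sample data are hypothetical, and confidence intervals are left out because their CRS form is not given here.

```python
def crs_sensitivity_specificity(samples):
    """Estimate sensitivity and specificity of a new test against a CRS.

    samples: iterable of (new_pos, standard_pos, resolver_pos) tuples.
             resolver_pos may be None when the standard is positive,
             because only standard-negative samples are re-tested.
    """
    tp = fn = fp = tn = 0
    for new_pos, standard_pos, resolver_pos in samples:
        # Assumed rule: CRS-positive if S is positive, or S negative and R positive.
        crs_pos = standard_pos or bool(resolver_pos)
        if crs_pos:
            tp += new_pos
            fn += not new_pos
        else:
            fp += new_pos
            tn += not new_pos
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one CRS-positive and one CRS-negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data, for illustration only.
data = (
    [(True, True, None)] * 40      # N+, S+
    + [(False, True, None)] * 5    # N-, S+
    + [(True, False, True)] * 8    # N+, S-, R+
    + [(False, False, True)] * 2   # N-, S-, R+
    + [(True, False, False)] * 3   # N+, S-, R-
    + [(False, False, False)] * 92 # N-, S-, R-
)
sens, spec = crs_sensitivity_specificity(data)
print(f"CRS sensitivity={sens:.3f}, CRS specificity={spec:.3f}")
```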

References

    1. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Ann Intern Med. 2003;138(1):40–44.
    2. Pfeifer J, editor. Molecular genetic testing in surgical pathology. Lippincott Williams & Wilkins; Philadelphia: 2006.
    3. Rosner BA. Fundamentals of biostatistics. 6th ed. Thomson Brooks Cole; Belmont, CA: 2006.
    4. FDA. Statistical guidance on reporting results from studies evaluating diagnostic tests. 2011. Available from: http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDo.... Updated 6 January 2011; cited 8 December 2011.
    5. Royse D, Thyer BA, Padgett DK. Program evaluation: An introduction. 5th ed. Wadsworth, Cengage Learning; Belmont, CA: 2010.
