Review

Eur J Clin Microbiol Infect Dis. 2012 Sep;31(9):2111-6.

doi: 10.1007/s10096-012-1602-1. Epub 2012 Mar 29.

Methods and recommendations for evaluating and reporting a new diagnostic test

A S Hess¹, M Shardell, J K Johnson, K A Thom, P Strassle, G Netzer, A D Harris

PMID: 22476385
PMCID: PMC3661219
DOI: 10.1007/s10096-012-1602-1

A S Hess et al. Eur J Clin Microbiol Infect Dis. 2012 Sep.

Affiliation

¹ University of Maryland School of Medicine, Baltimore, MD, USA. ahess@epi.umaryland.edu

Abstract

No standardized guidelines exist for the biostatistical methods appropriate for studies evaluating diagnostic tests. Publication recommendations such as the STARD statement provide guidance for the analysis of data, but biostatistical advice is minimal and application is inconsistent. This article aims to provide a self-contained, accessible resource on the biostatistical aspects of study design and reporting for investigators. For all dichotomous diagnostic tests, estimates of sensitivity and specificity should be reported with confidence intervals. Power calculations are strongly recommended to ensure that investigators achieve desired levels of precision. In the absence of a gold standard reference test, the composite reference standard method is recommended for improving estimates of the sensitivity and specificity of the test under evaluation.

Conflict of interest statement

JKJ has received funding from Becton Dickinson. The other authors declare that they have no conflict of interest.

Figures

**Fig. 1**
a A 2×2 paired contingency table for comparing the results of two tests on the same samples. b Results of a comparison between a new assay (“*Test A*”) and a gold standard assay
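As a minimal sketch of how sensitivity and specificity are read off such a 2×2 table, the following computes both from the four cell counts. The class name, field names, and the example counts are hypothetical and are not taken from Fig. 1b.

```python
from dataclasses import dataclass


@dataclass
class PairedTable:
    """Cell counts of a 2x2 table: new test vs. gold standard (names are hypothetical)."""
    both_pos: int       # new test positive, gold standard positive
    new_only_pos: int   # new test positive, gold standard negative
    gold_only_pos: int  # new test negative, gold standard positive
    both_neg: int       # new test negative, gold standard negative

    def sensitivity(self) -> float:
        # Proportion of gold-standard positives that the new test also calls positive.
        return self.both_pos / (self.both_pos + self.gold_only_pos)

    def specificity(self) -> float:
        # Proportion of gold-standard negatives that the new test also calls negative.
        return self.both_neg / (self.both_neg + self.new_only_pos)


# Illustrative counts only (an assumption, not data from the article).
table = PairedTable(both_pos=45, new_only_pos=5, gold_only_pos=15, both_neg=135)
print(f"sensitivity = {table.sensitivity():.3f}")
print(f"specificity = {table.specificity():.3f}")
```

Each estimate uses only one column of the table, so the two denominators (gold-standard positives and gold-standard negatives) are also the sample sizes that matter for the corresponding confidence intervals.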

**Fig. 2**
95 % confidence intervals around estimates of sensitivity. Both tests have a sensitivity of 60 %. Test one (*upper*) has a 95 % confidence interval of (52 %, 68 %). Test two (*lower*) has a 95 % confidence interval of (40 %, 80 %). The 95 % confidence interval around the estimate of the sensitivity for test one is narrower than that for test two; therefore, the estimate is more precise
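The difference in width can be traced to sample size. As a rough illustration, assume the usual normal-approximation interval and hypothetical sample sizes of 144 and 23 positive samples. These sample sizes are back-calculated for this example and are not reported in the source. A sensitivity estimate of 0.60 then gives approximately:

$$
\begin{aligned}
n = 144:&\quad 0.60 \pm 1.96\sqrt{\frac{0.60 \times 0.40}{144}} \approx 0.60 \pm 0.08 \;\Rightarrow\; (0.52,\ 0.68) \\
n = 23:&\quad 0.60 \pm 1.96\sqrt{\frac{0.60 \times 0.40}{23}} \approx 0.60 \pm 0.20 \;\Rightarrow\; (0.40,\ 0.80)
\end{aligned}
$$

Under these assumptions, the same point estimate is roughly two and a half times more precise when the number of positive samples increases about sixfold.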

**Fig. 3**
Formula for 95 % confidence intervals for sensitivity or specificity. p̂ is the estimate of sensitivity or specificity, and n is either the number of true-positive samples (for sensitivity) or the number of true-negative samples (for specificity). This formula is appropriate as long as both *n*p̂ and *n*(1−p̂) are not less than 5
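The formula itself appears only in the figure image. The sketch below assumes it is the standard normal-approximation (Wald) interval, which matches the stated condition that both products be at least 5. The function name `wald_ci` and the example values are hypothetical.

```python
import math


def wald_ci(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion.

    Assumes the interval is p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n);
    z = 1.96 corresponds to a two-sided 95 % interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Validity condition stated in the caption: n*p and n*(1-p) not less than 5.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("normal approximation is not appropriate for these values")
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Illustrative values only: estimated sensitivity of 0.90 from 100 positive samples.
low, high = wald_ci(0.90, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The explicit check makes the caption's validity condition part of the calculation, so that estimates near 0 or 1 from small samples are flagged rather than silently given a misleading interval.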

**Fig. 4**
Three-step method to approximate the sample size n* with 90 % power to estimate p with a margin of error no more than M. Step 1 calculates a preliminary estimate n based on p̂, the estimated sensitivity or specificity, and M. Step 2 gives ‘power’ to the sample size estimate by calculating p*, or the 90 % lower bound around p̂ given n. Step 3 calculates n* using the same equation as step 1, but substituting p* for p̂
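The three steps can be sketched as follows. The equations themselves are in the figure image, so this sketch makes two assumptions: step 1 inverts the normal-approximation margin of error, n = z²·p(1−p)/M², and the "90 % lower bound" in step 2 is a one-sided normal bound with z ≈ 1.2816. Function names and example inputs are hypothetical. The sketch is meant for estimates above 0.5, where lowering p moves it toward 0.5 and so increases the required sample size.

```python
import math

Z_95_TWO_SIDED = 1.96    # two-sided 95 % confidence
Z_90_ONE_SIDED = 1.2816  # one-sided 90 % bound (assumed interpretation of step 2)


def size_for_margin(p: float, margin: float, z: float = Z_95_TWO_SIDED) -> float:
    """Sample size at which z * sqrt(p * (1 - p) / n) equals the margin (assumed form)."""
    return z ** 2 * p * (1 - p) / margin ** 2


def powered_sample_size(p_hat: float, margin: float) -> tuple[int, float, int]:
    # Step 1: preliminary n from the anticipated sensitivity or specificity.
    n = math.ceil(size_for_margin(p_hat, margin))
    # Step 2: 90 % lower bound p* around p_hat, given the preliminary n.
    p_star = p_hat - Z_90_ONE_SIDED * math.sqrt(p_hat * (1 - p_hat) / n)
    # Step 3: same equation as step 1, with p* substituted for p_hat.
    n_star = math.ceil(size_for_margin(p_star, margin))
    return n, p_star, n_star


# Illustrative inputs only: anticipated sensitivity 0.90, margin of error 0.05.
n, p_star, n_star = powered_sample_size(0.90, 0.05)
print(f"preliminary n = {n}, p* = {p_star:.4f}, n* = {n_star}")
```

Note that n here counts positive samples when planning for sensitivity and negative samples when planning for specificity, so the total number of samples to collect also depends on the expected prevalence.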

**Fig. 5**
a Summary of the two stages of a composite reference standard (CRS) test of a new test (N). Samples labeled negative by the imperfect standard (S) are re-tested with the third test, the imperfect ‘resolver’ (R). b Example showing the two stages of a CRS resolution of the new test, “*Test A*”

**Fig. 6**
a Formulas for calculating sensitivity and specificity using a composite reference standard method. b Formulas for calculating 95 % confidence intervals around composite reference standard (CRS) estimates of sensitivity and specificity. See Fig. 5a for reference
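A minimal sketch of the two-stage resolution described in Fig. 5a follows. It assumes that a sample counts as reference-positive if S is positive, or if S is negative and R is positive, and as reference-negative only if both S and R are negative. The type alias, function names, and counts are hypothetical. Because the confidence-interval formulas of Fig. 6b are not reproduced in the caption, the sketch stops at the point estimates.

```python
from typing import Iterable, Optional

# (new test N, imperfect standard S, resolver R); R is None when S was positive
# and the sample was therefore not re-tested.
Sample = tuple[bool, bool, Optional[bool]]


def crs_positive(s_pos: bool, r_pos: Optional[bool]) -> bool:
    """Composite reference result: positive if S is positive, else whatever R says."""
    if s_pos:
        return True
    if r_pos is None:
        raise ValueError("samples negative by S must be re-tested with R")
    return r_pos


def crs_estimates(samples: Iterable[Sample]) -> tuple[float, float]:
    """Sensitivity and specificity of N against the composite reference standard."""
    tp = fn = fp = tn = 0
    for n_pos, s_pos, r_pos in samples:
        if crs_positive(s_pos, r_pos):
            tp += n_pos
            fn += not n_pos
        else:
            fp += n_pos
            tn += not n_pos
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative counts only (an assumption, not data from Fig. 5b).
samples: list[Sample] = (
    [(True, True, None)] * 40      # N+, S+
    + [(False, True, None)] * 10   # N-, S+
    + [(True, False, True)] * 8    # N+, S-, R+ (resolved to positive)
    + [(False, False, True)] * 2   # N-, S-, R+ (resolved to positive)
    + [(True, False, False)] * 5   # N+, S-, R-
    + [(False, False, False)] * 135  # N-, S-, R-
)
sensitivity, specificity = crs_estimates(samples)
print(f"CRS sensitivity = {sensitivity:.3f}, CRS specificity = {specificity:.3f}")
```

Only samples that S labels negative are re-tested, and the decision to re-test does not depend on the result of the new test N.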

References

1. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Ann Intern Med. 2003;138(1):40–44.
2. Pfeifer J, editor. Molecular genetic testing in surgical pathology. Philadelphia: Lippincott Williams & Wilkins; 2006.
3. Rosner BA. Fundamentals of biostatistics. 6th ed. Belmont, CA: Thomson Brooks Cole; 2006.
4. FDA. Statistical guidance on reporting results from studies evaluating diagnostic tests. 2011. Available from: http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDo.... Updated 6 January 2011; cited 8 December 2011.
5. Royse D, Thyer BA, Padgett DK. Program evaluation: An introduction. 5th ed. Belmont, CA: Wadsworth, Cengage Learning; 2010.
