Stat Med. 2011 Jul 10;30(15):1852-64.
doi: 10.1002/sim.4232. Epub 2011 Apr 15.

Bias in estimating accuracy of a binary screening test with differential disease verification


Todd A Alonzo et al. Stat Med. 2011.

Abstract

Sensitivity, specificity, and positive and negative predictive values are typically used to quantify the accuracy of a binary screening test. In some studies, it may not be ethical or feasible to obtain definitive disease ascertainment for all subjects using a gold standard test. When a gold standard test cannot be used, an imperfect reference test that is less than 100 per cent sensitive and specific may be used instead. In breast cancer screening, for example, follow-up for cancer diagnosis is used as an imperfect reference test for women for whom it is not possible to obtain gold standard results. This incomplete ascertainment of true disease, or differential disease verification, can result in biased estimates of accuracy. In this paper, we derive the apparent accuracy values for studies subject to differential verification. We determine how the bias is affected by the accuracy of the imperfect reference test, the percentage of subjects who receive the imperfect reference test rather than the gold standard, the prevalence of the disease, and the correlation between the results for the screening test and the imperfect reference test. We show that designs with differential disease verification can yield biased estimates of accuracy. Estimates of sensitivity in cancer screening trials may be substantially biased. However, careful design decisions, including selection of the imperfect reference test, can help to minimize bias. A hypothetical breast cancer screening study is used to illustrate the problem.
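The mechanism described in the abstract can be illustrated with a small numerical sketch. This is not the derivation from the paper. It assumes a simplified design in which every screen-positive subject receives the gold standard and every screen-negative subject receives the imperfect reference test, and it assumes the screening and reference tests are conditionally independent given true disease status (zero correlation). The function names `apparent_accuracy` and `percent_bias` are hypothetical, and the input values are borrowed from the top-row parameters in the Figure 3 caption purely as an example.

```python
def apparent_accuracy(prev, se_t, sp_t, se_r, sp_r):
    """Apparent sensitivity and specificity of a screening test T.

    Assumed design (an illustration, not the paper's derivation):
      - screen positives are verified by a perfect gold standard;
      - screen negatives are verified by an imperfect reference test R;
      - T and R are conditionally independent given true disease status.
    """
    # Joint probabilities of screening result and true disease status.
    pos_dis = prev * se_t                # T+, truly diseased
    pos_non = (1 - prev) * (1 - sp_t)    # T+, truly non-diseased
    neg_dis = prev * (1 - se_t)          # T-, truly diseased
    neg_non = (1 - prev) * sp_t          # T-, truly non-diseased

    # Screen negatives are labeled by the imperfect reference test, so
    # some are misclassified in each direction.
    neg_called_dis = neg_dis * se_r + neg_non * (1 - sp_r)
    neg_called_non = neg_dis * (1 - se_r) + neg_non * sp_r

    # Apparent accuracy uses the study's labels in place of the truth.
    apparent_se = pos_dis / (pos_dis + neg_called_dis)
    apparent_sp = neg_called_non / (pos_non + neg_called_non)
    return apparent_se, apparent_sp


def percent_bias(apparent, true):
    """Percent bias of an apparent value relative to the true value."""
    return 100.0 * (apparent - true) / true


# Example inputs: prevalence 10%, SeT=0.5, SpT=0.8, SeR=0.7, SpR=0.9.
se, sp = apparent_accuracy(0.1, 0.5, 0.8, 0.7, 0.9)
print(f"apparent Se = {se:.3f}, bias = {percent_bias(se, 0.5):.1f}%")
print(f"apparent Sp = {sp:.3f}, bias = {percent_bias(sp, 0.8):.1f}%")
```

Under these assumptions the apparent sensitivity is about 0.32 against a true value of 0.5, while specificity moves only slightly. Most of the distortion comes from false positives of the reference test among the many truly disease-free screen negatives, which inflate the count of subjects labeled as diseased. Setting `se_r` and `sp_r` to 1.0 recovers the true values.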


Figures

Figure 1
Diagram illustrating IDV and CDV designs.
Figure 2
Percent bias in sensitivity (solid curve) and specificity (dashed curve) for CDV design where the imperfect reference test is 100% specific but less than 100% sensitive. The percentage of results misclassified is varied. Prevalence is 10% and SpT = 0.7.
Figure 3
Percent bias in sensitivity resulting from CDV designs. Left panels have true prevalence of 10% while the right panels have prevalence of 30%. Top row: SeT=0.5, SpT=0.8, SeR=0.7, SpR=0.9; Middle row: SeT=0.7, SpT=0.9, SeR=0.7, SpR=0.9; Bottom row: SeT=0.9, SpT=0.9, SeR=0.7, SpR=0.7. Minimum TNNF (solid line), median TNNF (dashed line), and maximum TNNF (dotted line).
Figure 4
Percent bias in sensitivity resulting from IDV designs with τ = 0. Left panels have true prevalence of 10% while the right panels have prevalence of 30%. Top row: SeT=0.5, SpT=0.8, SeR=0.7, SpR=0.9, TPPF=0.3; Middle row: SeT=0.7, SpT=0.9, SeR=0.7, SpR=0.9, TPPF=0.5; Bottom row: SeT=0.9, SpT=0.9, SeR=0.7, SpR=0.7, TPPF=0.65. Minimum TNNF (solid line), median TNNF (dashed line), and maximum TNNF (dotted line).
Figure 5
Percent bias in PPV resulting from IDV designs with τ = 0. Left panels have true prevalence of 10% while the right panels have prevalence of 30%. Top row: SeT=0.5, SpT=0.8, SeR=0.7, SpR=0.9, TPPF=0.3; Middle row: SeT=0.7, SpT=0.9, SeR=0.7, SpR=0.9, TPPF=0.5; Bottom row: SeT=0.9, SpT=0.9, SeR=0.7, SpR=0.7, TPPF=0.65. Minimum TNNF (solid line), median TNNF (dashed line), and maximum TNNF (dotted line).


