On Estimating Diagnostic Accuracy From Studies With Multiple Raters and Partial Gold Standard Evaluation

Paul S Albert¹, Lori E Dodd

PMID: 19802353
PMCID: PMC2755302
DOI: 10.1198/016214507000000329

J Am Stat Assoc. 2008 Mar 1;103(481):61-73.

Affiliation

¹ Biometric Branch, Division of Cancer Treatment and Diagnosis, National Cancer Institute, Bethesda, MD 20892 ( albertp@mail.nih.gov ).

Abstract

We are often interested in estimating sensitivity and specificity of a group of raters or a set of new diagnostic tests in situations in which gold standard evaluation is expensive or invasive. Numerous authors have proposed latent modeling approaches for estimating diagnostic error without a gold standard. Albert and Dodd showed that, when modeling without a gold standard, estimates of diagnostic error can be biased when the dependence structure between tests is misspecified. In addition, they showed that choosing between different models for this dependence structure is difficult in most practical situations. While these results caution against using these latent class models, the difficulties of obtaining gold standard verification remain a practical reality. We extend two classes of models to provide a compromise that collects gold standard information on a subset of subjects but incorporates information from both the verified and nonverified subjects during estimation. We examine the robustness of diagnostic error estimation with this approach and show that choosing between competing models is easier in this context. In our analytic work and simulations, we consider situations in which verification is completely at random as well as settings in which the probability of verification depends on the actual test results. We apply our methodological work to a study designed to estimate the diagnostic error of digital radiography for gastric cancer.
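The idea of combining verified and nonverified subjects in one likelihood can be illustrated with a minimal sketch. This is not the authors' FM or GRE model: it assumes conditional independence between raters given true disease status, a common sensitivity and specificity for all raters, and a verification mechanism that depends at most on the observed ratings, so that it can be ignored in the likelihood. The function name and argument layout are hypothetical.

```python
import math

def log_likelihood(ratings, gold, sens, spec, prev):
    """Observed-data log-likelihood under a conditional-independence
    latent class model (an illustrative assumption, not the paper's model).

    ratings: list of per-subject lists of 0/1 ratings from J raters
    gold:    list with 0 or 1 for verified subjects, None if not verified
    """
    def joint(y, d):
        # P(D = d) * prod_j P(Y_j = y_j | D = d)
        p_pos = sens if d == 1 else 1.0 - spec  # P(rating = 1 | D = d)
        p = prev if d == 1 else 1.0 - prev
        for y_j in y:
            p *= p_pos if y_j == 1 else 1.0 - p_pos
        return p

    total = 0.0
    for y, d in zip(ratings, gold):
        if d is None:
            # Nonverified subject: sum over the unknown disease status
            total += math.log(joint(y, 0) + joint(y, 1))
        else:
            # Verified subject: disease status is observed
            total += math.log(joint(y, d))
    return total
```

Verified subjects contribute the joint probability of their ratings and their known disease status, while nonverified subjects contribute the marginal probability of their ratings alone. Maximizing this function over `sens`, `spec`, and `prev` uses both groups during estimation.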

Figures

**Figure 1**
Contour plot of relative asymptotic bias in sensitivity and specificity for 50% completely at random verification when the true model is a GRE model with *P_d* = .20, σ₀ = 1.5, σ₁ = 3, and J = 5. Relative asymptotic bias of sensitivity and specificity is defined as (*SENS*\* − *SENS*)/*SENS* (a) and (*SPEC*\* − *SPEC*)/*SPEC* (b), respectively. The contour plot was generated for sensitivities and specificities over an equally spaced grid ranging from .65 to .95 with 400 points.

**Figure 2**
Distribution of estimates of sensitivity using the FM and GRE models under completely random (CR) verification as well as under oversampling. Data were simulated under an FM model with J = 5, I = 1,000, *SENS* = .75, *SPEC* = .90, *P_d* = .2, η₁ = .5, and η₀ = .2. One thousand simulated realizations were obtained.

References

1. Albert PS, Dodd LE. A Cautionary Note on the Robustness of Latent Class Models for Estimating Diagnostic Error Without a Gold Standard. Biometrics. 2004;60:427–435.
2. Albert PS, McShane LM, Shih JH, et al. Latent Class Modeling Approaches for Assessing Diagnostic Error Without a Gold Standard: With Applications to p53 Immunohistochemical Assays in Bladder Tumors. Biometrics. 2001;57:610–619.
3. Alonzo TA, Pepe M. Using a Combination of Reference Tests to Assess the Accuracy of a Diagnostic Test. Statistics in Medicine. 1999;18:2987–3003.
4. Bahadur RR. A Representation of the Joint Distribution of Responses of n Dichotomous Items. In: Solomon H, editor. Studies in Item Analysis and Prediction. Stanford, CA: Stanford University Press; 1961. pp. 169–177.
5. Baker SG. Evaluating Multiple Diagnostic Tests With Partial Verification. Biometrics. 1995;51:330–337.

Grants and funding

Z99 HD999999/ImNIH/Intramural NIH HHS/United States
