Modeling rater diagnostic skills in binary classification processes

Xiaoyan Lin et al. Stat Med. 2018 Feb 20;37(4):557-571. doi: 10.1002/sim.7530. Epub 2017 Nov 2.
Abstract

Many disease diagnoses involve subjective judgments by qualified raters. For example, in the inspection of a mammogram, MRI, or ultrasound image, the clinician becomes part of the measuring instrument. To reduce diagnostic errors and improve the quality of diagnoses, it is necessary to assess raters' diagnostic skills and to improve them over time. This paper focuses on a subjective binary classification process, proposing a hierarchical model that links data on rater opinions with patients' true disease-development outcomes. The model quantifies the effects of rater diagnostic skills (bias and magnifier) and patient latent disease severity on the rating results. A Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed to estimate these parameters. By linking to patients' true disease outcomes, rater-specific sensitivity and specificity can be estimated from the MCMC samples. Cost theory is used to identify poor- and strong-performing raters and to guide adjustment of rater bias and diagnostic magnifier to improve rating performance. Furthermore, the diagnostic magnifier is shown to be a key parameter representing a rater's diagnostic ability, because a rater with a larger diagnostic magnifier has a uniformly better receiver operating characteristic (ROC) curve as the diagnostic bias varies. A simulation study is conducted to evaluate the proposed methods, and the methods are illustrated with a mammography example.
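To illustrate how rater bias and magnifier can determine a rater's operating point, the following sketch adopts a hypothetical probit link, P(positive | severity u) = Φ(b·u − a), and the two normal components N(−2, 1) (non-diseased) and N(2, 1) (diseased) used in Figure 2. The link function, the function name `rater_operating_point`, and the grid settings are illustrative assumptions, not the paper's exact specification.

```python
import math

def phi(x, mu=0.0, sd=1.0):
    """Normal density."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rater_operating_point(a, b, n=4001, lo=-10.0, hi=10.0):
    """Sensitivity and specificity of a rater with bias a and magnifier b,
    assuming P(positive | severity u) = Phi(b*u - a), with severity
    u ~ N(2, 1) for diseased and u ~ N(-2, 1) for non-diseased patients
    (the mixture components of Figure 2). Integrals by trapezoid rule."""
    h = (hi - lo) / (n - 1)
    sens = spec = 0.0
    for i in range(n):
        u = lo + i * h
        w = h * (0.5 if i in (0, n - 1) else 1.0)      # trapezoid weights
        p_pos = Phi(b * u - a)
        sens += w * p_pos * phi(u, 2.0, 1.0)           # true positive rate
        spec += w * (1.0 - p_pos) * phi(u, -2.0, 1.0)  # true negative rate
    return sens, spec

# A larger magnifier b sharpens the rater's discrimination at fixed bias:
for b in (0.5, 1.0, 2.0):
    sens, spec = rater_operating_point(a=0.0, b=b)
    print(f"b={b}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under this assumed link, increasing b raises both sensitivity and specificity at any fixed bias a, consistent with the abstract's claim that the magnifier governs diagnostic ability while the bias shifts the operating point along the ROC curve.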

Keywords: ROC; cost theory; diagnostic bias; diagnostic magnifier; disease severity.

PubMed Disclaimer

Figures

Figure 1
The mammography data: 107 radiologists and 146 mammograms. The main part of the plot shows the proportion of positive ratings for each radiologist in an ascending order. The right bar shows the true proportion of positive mammograms.
Figure 2
Comparison of ROC curves for different values of the diagnostic magnifier b when varying the value of the diagnostic bias a. A two-component normal mixture distribution F(u) = pΦ(u; −2, 1) + (1 − p)Φ(u; 2, 1) is adopted for calculating the sensitivity and specificity, where p = 0.8 and 0.5 for the left and right panels, respectively.
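The ROC curves of Figure 2 can be traced by sweeping the bias a at a fixed magnifier b. Assuming the same hypothetical probit link P(positive | u) = Φ(b·u − a) with diseased severity ~ N(2, 1) and non-diseased severity ~ N(−2, 1), both the true and false positive rates have closed probit forms, so the curve and its AUC are easy to compute; `roc_auc` and the sweep range are illustrative assumptions.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def roc_auc(b, n=2001):
    """AUC for a rater with magnifier b, tracing the ROC by sweeping the
    bias a. Assumes P(positive | u) = Phi(b*u - a), with diseased
    severity ~ N(2, 1) and non-diseased ~ N(-2, 1) (as in Figure 2);
    then TPR(a) = Phi((2b - a)/s) and FPR(a) = Phi((-2b - a)/s)
    with s = sqrt(1 + b^2)."""
    s = math.sqrt(1.0 + b * b)
    pts = []
    for i in range(n):
        a = -20.0 + 40.0 * i / (n - 1)       # sweep the bias
        tpr = Phi((2.0 * b - a) / s)         # sensitivity at bias a
        fpr = Phi((-2.0 * b - a) / s)        # 1 - specificity at bias a
        pts.append((fpr, tpr))
    pts.sort()
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0   # trapezoid rule
    return auc

for b in (0.5, 1.0, 2.0):
    print(f"b={b}: AUC={roc_auc(b):.4f}")
```

Under these assumptions the AUC is strictly increasing in b, matching the figure's point that a larger magnifier yields a uniformly better ROC curve regardless of the bias.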
Figure 3
Simulation study: plot of point estimates and 95% credible intervals of rater bias aj and diagnostic magnifier bj for the raters. True values of aj and bj for each rater are plotted as ×. Left panel is for the sample size m = n = 50; right panel is for the sample size m = 150 and n = 100.
Figure 4
Simulation study: plot of point estimates of sensitivity and specificity vs. true values of sensitivity and specificity. Left panel is for the sample size m = n = 50; right panel is for the sample size m = 150 and n = 100.
Figure 5
Simulation study: estimation of the distribution of patient latent disease severity u. The dotted line is the true density of the generated ui; the dashed line is the estimated density for sample size m = n = 50; the dot-dashed line is the estimated density for sample size m = 150 and n = 100.
Figure 6
The mammography data: estimation of diagnostic bias aj and diagnostic magnifier bj for the 107 radiologists. The two horizontal lines show the mean of estimated aj’s and bj’s, respectively.
Figure 7
The mammography data: estimation of the distribution of patient latent disease severity u. The estimated distributions are drawn separately for the patients with D = 0 (no disease) and for the patients with D = 1 (disease). The reference line is the standard normal density, used as the prior for u in the estimation algorithm.

