Meta-Analysis
Acad Radiol. 2014 Apr;21(4):481-90.
doi: 10.1016/j.acra.2013.12.011.

Comparative statistical properties of expected utility and area under the ROC curve for laboratory studies of observer performance in screening mammography

Craig K Abbey et al. Acad Radiol. 2014 Apr.

Abstract

Rationale and objectives: Our objective is to determine whether expected utility (EU) and the area under the receiver operating characteristic (ROC) curve (AUC) are consistent with one another as endpoints of observer performance studies in mammography. These two measures characterize ROC performance somewhat differently. We compare these two study endpoints at the level of individual reader effects, statistical inference, and components of variance across readers and cases.

Materials and methods: Using EU and AUC, we reanalyze three previously published laboratory observer performance studies that investigate various x-ray breast imaging modalities. The EU measure is based on recent estimates of relative utility for screening mammography.

Results: The AUC and EU measures are correlated across readers for individual modalities (r = 0.93) and differences in modalities (r = 0.94 to 0.98). Statistical inference for modality effects based on multi-reader multi-case analysis is very similar, with significant results (P < .05) in exactly the same conditions. Power analyses show mixed results across studies, with a small increase in power on average for EU that corresponds to approximately a 7% reduction in the number of readers. Despite a large number of crossing receiver operating characteristic curves (59% of readers), modality effects only rarely have opposite signs for EU and AUC (6%).

Conclusions: We do not find any evidence of systematic differences between EU and AUC in screening mammography observer studies. Thus, when utility approaches are viable (i.e., an appropriate value of relative utility exists), practical effects such as statistical efficiency may be used to choose study endpoints.

Keywords: Expected utility; area under the ROC curve; observer performance studies.

Figures

Figure 1
This diagram shows how AUC and EU are determined for a given ROC curve. A smooth ROC curve is fitted to observed (hypothetical) data using the contaminated binormal model and maximum likelihood fitting. The area under the ROC curve (AUC) is depicted in gray. Under the assumption that task utilities result in iso-utility lines with a given slope, the y-intercept of the highest iso-utility line that intersects the ROC curve defines the expected utility (EU) measure. Note that the iso-utility line is tangent to the ROC curve at the optimal operating point.
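
The construction in this caption can be sketched numerically. The example below is a minimal illustration, not the authors' code: it assumes a conventional binormal ROC curve (parameters `a` and `b`) in place of the contaminated binormal model, and the iso-utility slope passed as `slope` is a hypothetical value, not the relative-utility estimate used in the paper. EU is taken as the largest y-intercept, TPF minus slope times FPF, over operating points on the curve.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def binormal_tpf(fpf, a, b):
    """True-positive fraction on a binormal ROC curve at a given FPF.

    Assumed model (not the paper's contaminated binormal model):
    TPF = Phi(a + b * Phi^{-1}(FPF)).
    """
    if fpf <= 0.0:
        return 0.0
    if fpf >= 1.0:
        return 1.0
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


def binormal_auc(a, b):
    """Closed-form area under a binormal ROC curve."""
    return _STD_NORMAL.cdf(a / (1.0 + b * b) ** 0.5)


def expected_utility(a, b, slope, n_grid=10001):
    """Largest y-intercept of an iso-utility line touching the ROC curve.

    A line of the given slope through the point (FPF, TPF) has
    y-intercept TPF - slope * FPF; a grid search over FPF finds the
    highest such intercept and the operating point that attains it.
    """
    best_eu, best_fpf = 0.0, 0.0  # the (0, 0) corner gives intercept 0
    for i in range(n_grid):
        fpf = i / (n_grid - 1)
        eu = binormal_tpf(fpf, a, b) - slope * fpf
        if eu > best_eu:
            best_eu, best_fpf = eu, fpf
    return best_eu, best_fpf


# Hypothetical curve parameters and slope, chosen only for illustration.
auc = binormal_auc(a=1.5, b=1.0)
eu, fpf_opt = expected_utility(a=1.5, b=1.0, slope=1.0)
print(f"AUC = {auc:.4f}, EU = {eu:.4f} at FPF = {fpf_opt:.4f}")
```

A grid search is used instead of solving for the tangent point directly, so the sketch also works when the fitted curve is not concave everywhere.
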
Figure 2
Modality Effects. The AUC and EU figures of merit are shown for each reader and modality in the scatterplot of performance measures (A). Pairwise differences between modalities for each reader are shown for each of the three studies considered (B-D) with differences arranged so that the average difference across readers for any comparison is positive. In each study, the equation of the least-squares fitted line relating effect sizes is given.
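
The per-reader comparison described here, together with the correlations and opposite-sign fractions reported in the Results, can be sketched as follows. This is an illustrative helper under stated assumptions, not the authors' analysis: the function names are hypothetical, and the input lists are made-up placeholder values, not data from the studies.

```python
def pearson_and_fit(x, y):
    """Pearson correlation and least-squares line y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return r, slope, intercept


def sign_disagreement_fraction(x, y):
    """Fraction of paired effects whose signs are strictly opposite."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(1 for xi, yi in zip(x, y) if xi * yi < 0) / len(x)


# Placeholder per-reader modality differences (made up, NOT study data).
auc_diff = [0.020, -0.010, 0.050, 0.030, 0.005]
eu_diff = [0.030, 0.004, 0.070, 0.035, 0.010]

r, slope, intercept = pearson_and_fit(auc_diff, eu_diff)
print(f"r = {r:.3f}, EU diff = {slope:.3f} * AUC diff + {intercept:.3f}")
print(f"opposite signs: {sign_disagreement_fraction(auc_diff, eu_diff):.0%}")
```
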
Figure 3
Crossing ROC curves. The plot shows the fraction of readers with crossing ROC curves in each study as well as the fraction of readers with modality differences in AUC and EU that have different (opposite) signs. Note that both of these are elevated in the DMIST studies, where there is less of a modality effect.
Figure 4
The relative size of components of variance. Ratios of the elements of Table 2 are shown for each component of variance. The ratio is only shown for variance components greater than 0.0001. The three components directly related to modality comparisons in the MRMC design are indicated (*).
Figure 5
Power Analysis. Statistical power based on the Hillis and Berbaum method [35] is plotted as a function of the number of readers for the area under the ROC curve (A) and expected utility (B) figures of merit in each study (the legend in A applies to both plots). The number of cases used in the power calculation is the same as the number in the actual study. The effect size was set to the largest difference in reader-averaged performance across modalities, except in the DMIST studies, where a default of 0.1 was used for AUC and 0.136 was used for EU based on the regression line in Figure 2B. The number of readers needed to reach 80% power (C) varies considerably from study to study. On average, EU results in a 7% reduction in the number of readers needed to achieve 80% power.

References

    1. Goodenough DJ, Rossmann K, Lusted LB. Radiographic applications of receiver operating characteristic (ROC) curves. Radiology. 1974;110(1):89–95.
    2. Metz CE. ROC methodology in radiologic imaging. Invest Radiol. 1986;21(9):720–33.
    3. Metz CE. ROC analysis in medical imaging: a tutorial review of the literature. Radiol Phys Technol. 2008;1(1):2–12.
    4. Obuchowski NA. Receiver operating characteristic curves and their use in radiology. Radiology. 2003;229(1):3–8.
    5. Shiraishi J, Pesce LL, Metz CE, Doi K. Experimental design and data analysis in receiver operating characteristic studies: lessons learned from reports in radiology from 1997 to 2006. Radiology. 2009;253(3):822–30.
