Br J Radiol. 2014 Aug;87(1040):20140016.
doi: 10.1259/bjr.20140016. Epub 2014 Jun 2.

The average receiver operating characteristic curve in multireader multicase imaging studies

W Chen et al. Br J Radiol. 2014 Aug.

Abstract

Objective: In multireader, multicase (MRMC) receiver operating characteristic (ROC) studies for evaluating medical imaging systems, the area under the ROC curve (AUC) is often used as a summary metric. Owing to the limitations of AUC, plotting the average ROC curve to accompany the rigorous statistical inference on AUC is recommended. The objective of this article is to investigate methods for generating the average ROC curve from ROC curves of individual readers.

Methods: We present both a non-parametric method and a parametric method for averaging ROC curves that produce a ROC curve, the area under which is equal to the average AUC of individual readers (a property we call area preserving). We use hypothetical examples, simulated data and a real-world imaging data set to illustrate these methods and their properties.
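The area-preserving property can be illustrated with one simple kind of non-parametric averaging: averaging sensitivity across readers at each fixed false-positive fraction (1 − specificity). The sketch below is only an illustration under stated assumptions, not the authors' software: the three reader curves are hypothetical piecewise-linear ROC curves, the function names `trapezoid_area` and `vertical_average` are invented for this example, and each curve is assumed to have strictly increasing false-positive fractions running from 0 to 1.

```python
import numpy as np


def trapezoid_area(x, y):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


def vertical_average(curves):
    """Average sensitivity across readers at each false-positive fraction.

    curves: list of (fpr, tpr) pairs; each fpr is assumed strictly
    increasing from 0 to 1 (an assumption of this sketch).
    """
    # Use the union of all breakpoints so that every piecewise-linear
    # curve is represented exactly on the common grid.
    grid = np.unique(np.concatenate([np.asarray(f, dtype=float) for f, _ in curves]))
    tprs = np.array([np.interp(grid, f, t) for f, t in curves])
    return grid, tprs.mean(axis=0)


# Hypothetical reader curves as (false-positive fraction, sensitivity) points.
readers = [
    ([0.0, 0.1, 0.4, 1.0], [0.0, 0.6, 0.9, 1.0]),
    ([0.0, 0.2, 0.5, 1.0], [0.0, 0.5, 0.8, 1.0]),
    ([0.0, 0.3, 1.0], [0.0, 0.7, 1.0]),
]

grid, mean_tpr = vertical_average(readers)
mean_auc = float(np.mean([trapezoid_area(f, t) for f, t in readers]))
avg_curve_auc = trapezoid_area(grid, mean_tpr)

print(f"mean of reader AUCs:   {mean_auc:.6f}")
print(f"AUC of averaged curve: {avg_curve_auc:.6f}")
```

The two printed values agree because integration is linear: the area under a pointwise average of curves is the average of their areas. Averaging along other directions can give a differently shaped average curve.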

Results: We show that our proposed methods are area preserving. We also show that the method of averaging the ROC parameters, either the conventional bi-normal parameters (a, b) or the proper bi-normal parameters (c, da), is generally not area preserving and may produce a ROC curve that is intuitively not an average of multiple curves.
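The failure of parameter averaging to preserve area can be checked numerically for the conventional bi-normal model, whose AUC has the standard closed form Φ(a / √(1 + b²)), with Φ the standard normal cumulative distribution function. The following is a minimal sketch with hypothetical reader parameters chosen for illustration; it is not taken from the paper's data or software, and the name `binormal_auc` is invented here.

```python
from math import sqrt
from statistics import NormalDist


def binormal_auc(a, b):
    """AUC of a conventional bi-normal ROC curve: Phi(a / sqrt(1 + b^2))."""
    return NormalDist().cdf(a / sqrt(1.0 + b * b))


# Hypothetical (a, b) parameters for three readers.
readers = [(0.5, 1.0), (1.5, 1.0), (3.0, 1.0)]

# Average of the individual readers' AUCs.
mean_auc = sum(binormal_auc(a, b) for a, b in readers) / len(readers)

# AUC of the single curve obtained by averaging the parameters.
a_bar = sum(a for a, _ in readers) / len(readers)
b_bar = sum(b for _, b in readers) / len(readers)
auc_of_mean_params = binormal_auc(a_bar, b_bar)

gap = auc_of_mean_params - mean_auc
print(f"mean of reader AUCs:        {mean_auc:.4f}")
print(f"AUC of parameter-averaged:  {auc_of_mean_params:.4f}")
print(f"difference:                 {gap:.4f}")
```

Because the AUC is a non-linear function of a, the curve built from the averaged parameters has a different area from the average of the readers' AUCs whenever the readers' parameters differ appreciably.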

Conclusion: Our proposed methods are useful for making plots of average ROC curves in MRMC studies as a companion to the rigorous statistical inference on the AUC end point. The software implementing these methods is freely available from the authors.

Advances in knowledge: Methods for generating the average ROC curve in MRMC ROC studies are formally investigated. The area-preserving criterion we defined is useful to evaluate such methods.

Figures

Figure 1.
Graphs of the dependence of area under the receiver operating characteristic curve [area under the curve (AUC)] on model parameters. (a) The dependence of the AUC of a bi-normal model on parameter a. (b) The dependence of the AUC of a proper bi-normal model on parameter c. Some portions of these curves are well approximated as linear, but, in general, they are non-linear. AUCCvB, area under the conventional receiver operating characteristic curve; AUCproper, area under the proper receiver operating characteristic curve.
Figure 2.
Graphs of three receiver operating characteristic curves (dotted lines), and three different kinds of non-parametric averages of those curves (solid lines). Different types of non-parametric averaging lead to different shapes of average curves, all with the same area. AUC, area under the curve; Se, sensitivity; Sp, specificity.
Figure 3.
Graphs of three receiver operating characteristic curves (dotted lines), and two different kinds of parametric averages of those curves. Averaging model parameters (bold dotted line) gives a different curve than the proposed method (solid line), which preserves the average area under the curve (AUC).
Figure 4.
Graphs of six receiver operating characteristic (ROC) curves (solid lines) and two possible average ROC curves. The dashed line is the ROC curve obtained from averaging model parameters c and da. The dash-dotted curve is a non-parametric average, which preserves area. AUC, area under the curve; Se, sensitivity.
Figure 5.
Non-parametric average receiver operating characteristic (ROC) curves. These curves are averages of 15 simulated ROC curves (the parameters are listed in Table 1). (a) Non-parametric averages of non-parametric ROC curve estimates. (b) Non-parametric averages of parametric ROC curve estimates. In these examples, the different non-parametric averaging methods give very similar results for both simulated imaging modalities. Se, sensitivity; Sp, specificity.
Figure 6.
Receiver operating characteristic (ROC) curves averaged from 15 simulated ROC curves (the parameters are listed in Table 1). Each simulated modality is averaged with non-parametric and parametric averaging [averaging the (c, da) parameters]. For modality 1, the parametric and non-parametric averaging give nearly identical results, and for modality 2, they give very different results. Se, sensitivity; Sp, specificity.
Figure 7.
Receiver operating characteristic (ROC) curves from simulated imaging modality 2 (grey lines) and averages of those curves (black lines). The non-parametric averaging gives a far more reasonable average curve than does the method of averaging ROC model parameters, which falls well below most of the individual curves. Se, sensitivity; Sp, specificity.
Figure 8.
Receiver operating characteristic (ROC) curves and averaging for a real-world data set: radiologists read cine MRI images. (a) Non-parametric average (thick solid line) of empirical ROC curves of five radiologists (thin lines). (b) Non-parametric and parametric averages of parametric ROC curve estimates. Se, sensitivity; Sp, specificity.