Br J Radiol. 2014 Aug;87(1040):20140016.

doi: 10.1259/bjr.20140016. Epub 2014 Jun 2.

The average receiver operating characteristic curve in multireader multicase imaging studies

W Chen¹, F W Samuelson

Affiliation

¹ Division of Imaging and Applied Mathematics, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, MD, USA.

PMID: 24884728
PMCID: PMC4112395
DOI: 10.1259/bjr.20140016

W Chen et al. Br J Radiol. 2014 Aug.

Abstract

Objective: In multireader, multicase (MRMC) receiver operating characteristic (ROC) studies for evaluating medical imaging systems, the area under the ROC curve (AUC) is often used as a summary metric. Owing to the limitations of AUC, plotting the average ROC curve to accompany the rigorous statistical inference on AUC is recommended. The objective of this article is to investigate methods for generating the average ROC curve from ROC curves of individual readers.

Methods: We present both a non-parametric method and a parametric method for averaging ROC curves that produce a ROC curve, the area under which is equal to the average AUC of individual readers (a property we call area preserving). We use hypothetical examples, simulated data and a real-world imaging data set to illustrate these methods and their properties.
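The area-preserving property can be illustrated with a minimal sketch. It assumes one simple form of non-parametric averaging, namely averaging the readers' sensitivities at each fixed false-positive fraction; the abstract does not spell out the authors' exact procedure, so this is not presented as their implementation. The three reader curves and all function names are hypothetical, and each curve is assumed to be piecewise linear with strictly increasing false-positive fractions. Because integration is linear, the area under the pointwise mean of the curves equals the mean of the areas.

```python
from bisect import bisect_right

def interp(x, xs, ys):
    """Linearly interpolate the piecewise-linear curve (xs, ys) at x.
    Assumes xs is strictly increasing and xs[0] <= x <= xs[-1]."""
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (x - xs[i])

def trapezoid_auc(fpf, tpf):
    """Area under a piecewise-linear ROC curve by the trapezoid rule."""
    return sum((fpf[i + 1] - fpf[i]) * (tpf[i + 1] + tpf[i]) / 2
               for i in range(len(fpf) - 1))

def vertical_average(curves):
    """Average sensitivity (TPF) across readers at each FPF.
    The grid is the union of all readers' FPF breakpoints, so the
    averaged curve is exactly piecewise linear on that grid."""
    grid = sorted({x for fpf, _ in curves for x in fpf})
    avg_tpf = [sum(interp(x, fpf, tpf) for fpf, tpf in curves) / len(curves)
               for x in grid]
    return grid, avg_tpf

# Hypothetical reader ROC curves as (FPF points, TPF points).
readers = [
    ([0.0, 0.1, 0.4, 1.0], [0.0, 0.6, 0.9, 1.0]),
    ([0.0, 0.2, 0.5, 1.0], [0.0, 0.5, 0.8, 1.0]),
    ([0.0, 0.3, 1.0], [0.0, 0.9, 1.0]),
]

reader_aucs = [trapezoid_auc(fpf, tpf) for fpf, tpf in readers]
mean_auc = sum(reader_aucs) / len(reader_aucs)

avg_fpf, avg_tpf = vertical_average(readers)
avg_curve_auc = trapezoid_auc(avg_fpf, avg_tpf)

print("mean of reader AUCs:  ", mean_auc)
print("AUC of averaged curve:", avg_curve_auc)
```

The two printed values agree up to floating-point rounding, which is the area-preserving criterion in this simple setting.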

Results: We show that our proposed methods are area preserving. We also show that the method of averaging the ROC parameters, either the conventional bi-normal parameters (a, b) or the proper bi-normal parameters (c, d_a), is generally not area preserving and may produce a ROC curve that is intuitively not an average of multiple curves.
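A small numerical check makes the point about parameter averaging concrete for the conventional bi-normal model, whose AUC is Φ(a / √(1 + b²)), with Φ the standard normal cumulative distribution function. The two readers' (a, b) values below are hypothetical, chosen only so that the non-linearity is visible. The last step, which solves for an a that restores the mean AUC while keeping the averaged b, is an illustrative assumption and is not claimed to be the authors' parametric method.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def binormal_auc(a, b):
    """AUC of the conventional bi-normal ROC curve with parameters (a, b)."""
    return PHI.cdf(a / sqrt(1.0 + b * b))

# Hypothetical (a, b) parameters for two readers.
readers = [(0.5, 1.0), (3.0, 1.0)]

# Average of the individual readers' AUCs.
mean_auc = sum(binormal_auc(a, b) for a, b in readers) / len(readers)

# AUC of the curve obtained by averaging the parameters instead.
a_bar = sum(a for a, _ in readers) / len(readers)
b_bar = sum(b for _, b in readers) / len(readers)
auc_of_mean_params = binormal_auc(a_bar, b_bar)

# Illustrative fix (an assumption, not the paper's method): keep b_bar and
# choose a so that the curve's AUC equals the mean reader AUC.
a_matched = PHI.inv_cdf(mean_auc) * sqrt(1.0 + b_bar * b_bar)

print("mean of reader AUCs:       ", mean_auc)
print("AUC of averaged parameters:", auc_of_mean_params)
print("AUC with matched a:        ", binormal_auc(a_matched, b_bar))
```

With these values the AUC of the parameter-averaged curve is noticeably larger than the mean reader AUC, because AUC is a non-linear function of a.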

Conclusion: Our proposed methods are useful for making plots of average ROC curves in MRMC studies as a companion to the rigorous statistical inference on the AUC end point. The software implementing these methods is freely available from the authors.

Advances in knowledge: Methods for generating the average ROC curve in MRMC ROC studies are formally investigated. The area-preserving criterion we defined is useful to evaluate such methods.

Figures

**Figure 1.**
Graphs of the dependence of the area under the receiver operating characteristic curve (AUC) on model parameters. (a) The dependence of the AUC of a bi-normal model on parameter a. (b) The dependence of the AUC of a proper bi-normal model on parameter c. Some portions of these curves are well approximated as linear, but, in general, they are non-linear. AUC_CvB, area under the conventional receiver operating characteristic curve; AUC_proper, area under the proper receiver operating characteristic curve.

**Figure 2.**
Graphs of three receiver operating characteristic curves (dotted lines), and three different kinds of non-parametric averages of those curves (solid lines). Different types of non-parametric averaging lead to different shapes of average curves, all with the same area. AUC, area under the curve; S_e, sensitivity; S_p, specificity.

**Figure 3.**
Graphs of three receiver operating characteristic curves (dotted lines), and two different kinds of parametric averages of those curves. Averaging model parameters (bold dotted line) gives a different curve than the proposed method (solid line), which preserves the average area under the curve (AUC).

**Figure 4.**
Graphs of six receiver operating characteristic (ROC) curves (solid lines) and two possible average ROC curves. The dashed line is the ROC curve obtained from averaging model parameters c and d_a. The dash-dotted curve is a non-parametric average, which preserves area. AUC, area under the curve; S_e, sensitivity.

**Figure 5.**
Non-parametric average receiver operating characteristic (ROC) curves. These curves are averages of 15 simulated ROC curves (the parameters are listed in Table 1). (a) Non-parametric averages of non-parametric ROC curve estimates. (b) Non-parametric averages of parametric ROC curve estimates. In these examples, the different non-parametric averaging methods give very similar results for both simulated imaging modalities. S_e, sensitivity; S_p, specificity.

**Figure 6.**
Receiver operating characteristic (ROC) curves averaged from 15 simulated ROC curves (the parameters are listed in Table 1). Each simulated modality is averaged with non-parametric and parametric averaging [averaging the (c, d_a) parameters]. For modality 1, the parametric and non-parametric averaging give nearly identical results, and for modality 2, they give very different results. S_e, sensitivity; S_p, specificity.

**Figure 7.**
Receiver operating characteristic (ROC) curves from simulated imaging modality 2 (grey lines) and averages of those curves (black lines). The non-parametric averaging gives a far more reasonable average curve than does the method of averaging ROC model parameters, which falls well below most of the individual curves. S_e, sensitivity; S_p, specificity.

**Figure 8.**
Receiver operating characteristic (ROC) curves and averaging for a real-world data set: radiologists read cine MRI images. (a) Non-parametric average (thick solid line) of empirical ROC curves of five radiologists (thin lines). (b) Non-parametric and parametric averages of parametric ROC curve estimates. S_e, sensitivity; S_p, specificity.
