Experimental design and data analysis in receiver operating characteristic studies: lessons learned from reports in radiology from 1997 to 2006

Junji Shiraishi¹, Lorenzo L Pesce, Charles E Metz, Kunio Doi

PMID: 19864510
PMCID: PMC2786192
DOI: 10.1148/radiol.2533081632

Radiology. 2009 Dec;253(3):822-30.

doi: 10.1148/radiol.2533081632. Epub 2009 Oct 28.

Affiliation

¹ Kurt Rossmann Laboratories for Radiologic Image Research, Department of Radiology, University of Chicago, Chicago, IL, USA. j2s@kumamoto-u.ac.jp

Abstract

Purpose: To provide a broad perspective concerning the recent use of receiver operating characteristic (ROC) analysis in medical imaging by reviewing ROC studies published in Radiology between 1997 and 2006 for experimental design, imaging modality, medical condition, and ROC paradigm.

Materials and methods: Two hundred ninety-five studies were obtained by searching PubMed with two criteria: publication in Radiology between 1997 and 2006 and occurrence of the phrase "receiver operating characteristic." Studies returned by the query that were not performance evaluations of diagnostic imaging procedures were excluded. Characteristics of the remaining studies were tabulated.

Results: Two hundred thirty-three (79.0%) of the 295 studies reported findings based on observers' diagnostic judgments or objective measurements. Forty-three (14.6%) did not include human observers, with most of these reporting an evaluation of a computer-aided diagnosis system or functional data obtained with computed tomography (CT) or magnetic resonance (MR) imaging. The remaining 19 (6.4%) studies were classified as reviews or meta-analyses and were excluded from our subsequent analysis. Among the various imaging modalities, MR imaging (46.0%) and CT (25.7%) were investigated most frequently. Approximately 60% (144 of 233) of ROC studies with human observers published in Radiology included three or fewer observers.
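The proportions in the Results can be re-derived from the reported counts. The following is a minimal sketch of that arithmetic; the dictionary keys and variable names are illustrative labels, not terms taken from the paper.

```python
# Counts as reported in the Results paragraph.
total_studies = 295
counts = {
    "observer_or_measurement": 233,   # observers' judgments or objective measurements
    "no_human_observers": 43,         # e.g. CAD evaluations, functional CT/MR data
    "review_or_meta_analysis": 19,    # excluded from subsequent analysis
}

# The three categories should account for every retrieved study.
assert sum(counts.values()) == total_studies

# Percentage of all retrieved studies, rounded to one decimal place.
percentages = {
    name: round(100 * n / total_studies, 1) for name, n in counts.items()
}

# Share of observer studies that used three or fewer observers (144 of 233).
few_observer_share = round(100 * 144 / 233, 1)

print(percentages)
print(few_observer_share)
```

The last figure comes out slightly above 60%, which is consistent with the abstract's wording of "approximately 60%".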

Conclusion: ROC analysis is widely used in radiologic research, confirming its fundamental role in assessing diagnostic performance. However, the ROC studies reported in Radiology were not always adequate to support clear and clinically relevant conclusions.

Figures

**Figure 1:**
Bar graph of imaging modalities used in ROC studies. Number to right of each bar is number of studies in which that modality was used. Sum of data does not equal number of studies because some studies used more than one modality. Denominator for percentages was number of studies (n = 276). *CAD* = computer-aided diagnosis, *CR/DR* = computed radiography/digital radiography, *DSA* = digital subtraction angiography, *PACS* = picture archiving and communication system, *PET/SPECT* = positron emission tomography/single photon emission computed tomography, *US* = ultrasonography.

**Figure 2:**
Bar graph of number of observers in 233 subgroup A ROC studies.

**Figure 3:**
Bar graph of number of cases (sum of those with positive findings and those with negative findings) in 233 subgroup A ROC studies.

**Figure 4:**
Bar graph of data reported in 249 ROC studies. Percentages do not sum to 100 because some studies reported more than one output value. *NPV* = negative predictive value, *PPV* = positive predictive value.

**Figure 5:**
Scatterplot of the relationship between the difference in each pair of AUCs obtained when comparing two different treatments and the AUC for the inferior treatment in that pair, with statistical significance and number of cases indicated in the key. Two ROC studies (arrows) were performed by the same research group and employed a similar experimental design (phantom images as a clustered case sample; four observers; LABMRMC software used). Most likely cause of nonsignificant results is a small number of observers. Dashed line = theoretical maximum difference in AUC given the AUC for the inferior treatment. *Not SIG.* = not significant, *SIG.* = significant.
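The dashed line in Figure 5 follows from the fact that an AUC cannot exceed 1.0, so the largest possible gain over an inferior treatment with a given AUC is one minus that AUC. The sketch below illustrates this with a nonparametric (Mann-Whitney) AUC estimate. It is not the method used in the reviewed studies (the caption mentions LABMRMC software), and the function names and confidence ratings are hypothetical, chosen only for illustration.

```python
def empirical_auc(positive_scores, negative_scores):
    """Mann-Whitney AUC estimate: the fraction of (positive, negative)
    case pairs in which the positive case is rated higher, with ties
    counted as half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def max_auc_difference(auc_inferior):
    """Upper bound on the AUC difference (the dashed line in Figure 5):
    the superior treatment's AUC cannot exceed 1.0."""
    if not 0.0 <= auc_inferior <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    return 1.0 - auc_inferior


# Hypothetical 5-point confidence ratings from one observer.
# For simplicity the negative cases get the same ratings under both treatments.
negatives = [1, 2, 3, 2]
positives_a = [3, 4, 2, 5]   # treatment A
positives_b = [4, 5, 3, 5]   # treatment B

auc_a = empirical_auc(positives_a, negatives)
auc_b = empirical_auc(positives_b, negatives)
difference = abs(auc_b - auc_a)
bound = max_auc_difference(min(auc_a, auc_b))

print(auc_a, auc_b, difference, bound)
```

Any observed difference must fall on or below the bound, which is why points in the scatterplot cannot lie above the dashed line. Whether a given difference is statistically significant depends on the number of cases and observers, which this sketch does not model.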
