Radiology. 2009 Dec;253(3):822-30.
doi: 10.1148/radiol.2533081632. Epub 2009 Oct 28.

Experimental design and data analysis in receiver operating characteristic studies: lessons learned from reports in radiology from 1997 to 2006

Junji Shiraishi et al. Radiology. 2009 Dec.

Abstract

Purpose: To provide a broad perspective concerning the recent use of receiver operating characteristic (ROC) analysis in medical imaging by reviewing ROC studies published in Radiology between 1997 and 2006 for experimental design, imaging modality, medical condition, and ROC paradigm.

Materials and methods: Two hundred ninety-five studies were obtained by conducting a literature search with PubMed with two criteria: publication in Radiology between 1997 and 2006 and occurrence of the phrase "receiver operating characteristic." Studies returned by the query that were not diagnostic imaging procedure performance evaluations were excluded. Characteristics of the remaining studies were tabulated.

Results: Two hundred thirty-three (79.0%) of the 295 studies reported findings based on observers' diagnostic judgments or objective measurements. Forty-three (14.6%) did not include human observers, with most of these reporting an evaluation of a computer-aided diagnosis system or functional data obtained with computed tomography (CT) or magnetic resonance (MR) imaging. The remaining 19 (6.4%) studies were classified as reviews or meta-analyses and were excluded from our subsequent analysis. Among the various imaging modalities, MR imaging (46.0%) and CT (25.7%) were investigated most frequently. Approximately 60% (144 of 233) of ROC studies with human observers published in Radiology included three or fewer observers.
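The percentages in the Results paragraph follow directly from the reported counts. The short Python check below is only an illustrative sketch: the counts come from the abstract, while the dictionary keys and the `pct` helper are hypothetical names introduced here.

```python
# Study counts as reported in the Results paragraph of the abstract.
TOTAL_STUDIES = 295
counts = {
    "observer_based": 233,            # observers' judgments or objective measurements
    "no_human_observers": 43,         # e.g., CAD systems, functional CT/MR data
    "reviews_or_meta_analyses": 19,   # excluded from subsequent analysis
}


def pct(n: int, denom: int) -> float:
    """Return n as a percentage of denom, rounded to one decimal place."""
    return round(100 * n / denom, 1)


# Share of each category among all 295 studies.
shares = {name: pct(n, TOTAL_STUDIES) for name, n in counts.items()}

# Share of observer-based studies that used three or fewer observers.
three_or_fewer_observers = pct(144, counts["observer_based"])

print(shares)
print(three_or_fewer_observers)
```

The three categories account for all 295 studies, and the 144 of 233 figure works out to slightly above the "approximately 60%" stated in the text.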

Conclusion: ROC analysis is widely used in radiologic research, confirming its fundamental role in assessing diagnostic performance. However, the ROC studies reported in Radiology were not always adequate to support clear and clinically relevant conclusions.

Figures

Figure 1:
Bar graph of imaging modalities used in ROC studies. Number to right of each bar is number of studies in which that modality was used. Sum of data does not equal number of studies because some studies used more than one modality. Denominator for percentages was number of studies (n = 276). CAD = computer-aided diagnosis, CR/DR = computed radiography/digital radiography, DSA = digital subtraction angiography, PACS = picture archiving and communication system, PET/SPECT = positron emission tomography/single photon emission computed tomography, US = ultrasonography.
Figure 2:
Bar graph of number of observers in 233 subgroup A ROC studies.
Figure 3:
Bar graph of number of cases (sum of those with positive findings and those with negative findings) in 233 subgroup A ROC studies.
Figure 4:
Bar graph of data reported in 249 ROC studies. Percentages do not sum to 100 because some studies reported more than one output value. NPV** = negative predictive value, PPV* = positive predictive value.
Figure 5:
Scatterplot of the difference in AUC between two compared treatments versus the AUC for the inferior treatment in each pair; the key indicates statistical significance and number of cases. Two ROC studies (arrows) were performed by the same research group and employed a similar experimental design (phantom images as a clustered case sample; four observers; LABMRMC software used). Most likely cause of nonsignificant results is a small number of observers. Dashed line = theoretical maximum difference in AUC for a given AUC of the inferior treatment. Not SIG. = not significant, SIG. = significant.
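The dashed line in Figure 5 can be read as a simple bound: because an AUC cannot exceed 1, the difference between two AUCs cannot exceed 1 minus the AUC of the inferior treatment. The sketch below assumes this is the bound the caption refers to; the function name `max_auc_difference` is hypothetical and does not come from the article.

```python
def max_auc_difference(auc_inferior: float) -> float:
    """Upper bound on the AUC difference, given the inferior treatment's AUC.

    Assumption: the bound arises only from the superior treatment's AUC
    being at most 1.0.
    """
    if not 0.0 <= auc_inferior <= 1.0:
        raise ValueError("AUC must lie in [0, 1]")
    return 1.0 - auc_inferior


# Example: if the inferior treatment has AUC 0.75, no comparison can show
# an improvement larger than 0.25.
print(max_auc_difference(0.75))
```

Under this reading, studies whose inferior treatment already has a high AUC leave little room for a measurable difference, which makes significance harder to reach with few observers or cases.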
