Agreement of the order of overall performance levels under different reading paradigms

David Gur¹, Andriy I Bandos, Amy H Klym, Cathy S Cohen, Christiane M Hakim, Lara A Hardesty, Marie A Ganott, Ronald L Perrin, William R Poller, Ratan Shah, Jules H Sumkin, Luisa P Wallace, Howard E Rockette

PMID: 19000873
PMCID: PMC2601626
DOI: 10.1016/j.acra.2008.07.011

Acad Radiol. 2008 Dec;15(12):1567-73.

Affiliation

¹ Department of Radiology, University of Pittsburgh, 3362 Fifth Avenue, Pittsburgh, PA 15213-3180, USA. gurd@upmc.edu

Abstract

Rationale and objectives: To investigate consistency of the orders of performance levels when interpreting mammograms under three different reading paradigms.

Materials and methods: We performed a retrospective observer study in which nine experienced radiologists rated an enriched set of mammography examinations that they personally had read in the clinic ("individualized") mixed with a set that none of them had read in the clinic ("common set"). Examinations were interpreted under three different reading paradigms: binary using screening Breast Imaging Reporting and Data System (BI-RADS), receiver-operating characteristic (ROC), and free-response ROC (FROC). The performance in discriminating between cancer and noncancer findings under each of the paradigms was summarized using Youden's index/2 + 0.5 (Binary), nonparametric area under the ROC curve (AUC), and an overall FROC index (JAFROC-2). Pearson correlation coefficients were then computed to assess consistency in the ordering of observers' performance levels. Statistical significance of the computed correlation coefficients was assessed using bootstrap confidence intervals obtained by resampling sets of examination-specific observations.
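The Binary and ROC summary indices, their correlation across readers, and a case-resampling bootstrap interval can be sketched as follows. This is only an illustration under stated assumptions, not the authors' analysis code: it assumes every reader rated the same cases (the study also used reader-specific sets), it uses a simple percentile bootstrap (the abstract does not say which bootstrap interval was used), and it omits the JAFROC-2 index. All function names and the `(is_cancer, recalls, ratings)` case layout are hypothetical.

```python
import random
from math import sqrt

def binary_index(sensitivity, specificity):
    """Youden's index / 2 + 0.5, i.e. the mean of sensitivity and specificity."""
    youden = sensitivity + specificity - 1.0
    return youden / 2.0 + 0.5

def nonparametric_auc(cancer_ratings, noncancer_ratings):
    """Mann-Whitney estimate: P(cancer rated higher) + 0.5 * P(tie)."""
    score = 0.0
    for c in cancer_ratings:
        for n in noncancer_ratings:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(cancer_ratings) * len(noncancer_ratings))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def reader_indices(cases, n_readers):
    """Per-reader Binary index and AUC.

    Hypothetical layout: each case is (is_cancer, recalls, ratings), where
    recalls[r] is 0/1 and ratings[r] is the ROC rating given by reader r.
    """
    pos = [c for c in cases if c[0]]
    neg = [c for c in cases if not c[0]]
    binary, auc = [], []
    for r in range(n_readers):
        sens = sum(c[1][r] for c in pos) / len(pos)
        spec = sum(1 - c[1][r] for c in neg) / len(neg)
        binary.append(binary_index(sens, spec))
        auc.append(nonparametric_auc([c[2][r] for c in pos],
                                     [c[2][r] for c in neg]))
    return binary, auc

def bootstrap_ci(cases, n_readers, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the Binary-vs-AUC correlation (assumed method)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample whole cases so each case keeps all readers' observations.
        sample = [rng.choice(cases) for _ in cases]
        try:
            binary, auc = reader_indices(sample, n_readers)
            stats.append(pearson(binary, auc))
        except ZeroDivisionError:
            continue  # degenerate resample (one class missing or zero variance)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper
```

Resampling whole cases rather than individual ratings preserves the correlation among readers who interpreted the same examination, which is the point of resampling "sets of examination-specific observations".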

Results: All but one of the computed pair-wise correlation coefficients were larger than 0.66 and were significantly different from zero. The correlation between the overall performance measures under the Binary and ROC paradigms was the lowest (0.43) and was not significantly different from zero (95% confidence interval -0.078 to 0.733).

Conclusion: The use of different evaluation paradigms in the laboratory tends to lead to consistent ordering of the overall performance levels of observers. However, one should recognize that conceptually similar performance indexes resulting from different paradigms often measure different performance characteristics and thus disagreements are not only possible but frequently quite natural.

Figures

**Figure 1. An example of discordant ordering of overall performance measures under the Binary and ROC paradigms in the presence of perfect “agreement” of actual performances under the two paradigms**
The area under the solid ROC curve (a) is 0.82. The point on this curve (b) has coordinates (0.35, 0.85), corresponding to a (Youden’s index+1)/2 of 0.75. The area under the dashed ROC curve (c) is 0.85. The point on this curve (d) has coordinates (0.14, 0.60), corresponding to a (Youden’s index+1)/2 of 0.73.
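The arithmetic in this caption can be checked with a few lines. The sketch below assumes the coordinates are given as (false-positive fraction, true-positive fraction), the usual ROC axes; the caption does not state this explicitly, and the function name is hypothetical.

```python
def half_youden_plus_one(fpf, tpf):
    """(Youden's index + 1) / 2, where Youden's index = TPF - FPF."""
    return ((tpf - fpf) + 1.0) / 2.0

# Values taken from the caption; the axis order (FPF, TPF) is assumed.
solid = {"auc": 0.82, "point": (0.35, 0.85)}
dashed = {"auc": 0.85, "point": (0.14, 0.60)}

solid_binary = half_youden_plus_one(*solid["point"])
dashed_binary = half_youden_plus_one(*dashed["point"])

print(round(solid_binary, 2), round(dashed_binary, 2))  # 0.75 0.73

# Discordant ordering: the solid curve wins on the Binary index,
# while the dashed curve wins on AUC.
discordant = (solid_binary > dashed_binary) and (solid["auc"] < dashed["auc"])
print(discordant)  # True
```

The two indices rank the curves in opposite order even though each operating point lies exactly on its own ROC curve, which is the sense in which the actual performances "agree" while the summary measures do not.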

Grants and funding

R01 EB003503/EB/NIBIB NIH HHS/United States
