Acad Radiol. 2013 Jul;20(7):798-806.
doi: 10.1016/j.acra.2013.02.008. Epub 2013 Apr 20.

Statistical power considerations for a utility endpoint in observer performance studies

Craig K Abbey et al. Acad Radiol. 2013 Jul.

Abstract

Rationale and objectives: The purpose of this investigation is to compare the statistical power of the most common performance measure in observer performance studies, area under the ROC curve (AUC), with that of an expected utility (EU) endpoint.

Materials and methods: We have modified a well-known simulation procedure developed by Roe and Metz for statistical power analysis in receiver operating characteristic (ROC) studies. Starting from a set of baseline simulations, we investigate the effects of three parameters that describe properties of the observers (iso-utility slope, unequal variance, and tendency to favor more aggressive or conservative actions) and three parameters that affect experimental design (number of readers, number of cases, and fraction of positive cases).
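As a rough illustration of the two endpoints being compared, the sketch below computes an empirical AUC and an expected-utility value from one reader's scores. It is not the modified Roe and Metz simulation described above. The form EU = TPF - beta * FPF (with beta standing in for the iso-utility slope), the function names, the decision threshold, and the toy scores are all assumptions made for illustration only; the abstract does not specify them.

```python
def empirical_auc(neg_scores, pos_scores):
    """Mann-Whitney estimate of the area under the ROC curve (ties count 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def expected_utility(neg_scores, pos_scores, threshold, beta):
    """Assumed form: EU = TPF - beta * FPF at a single decision threshold.

    beta plays the role of the iso-utility slope; this formula is an
    assumption of the sketch, not taken from the abstract.
    """
    tpf = sum(s >= threshold for s in pos_scores) / len(pos_scores)
    fpf = sum(s >= threshold for s in neg_scores) / len(neg_scores)
    return tpf - beta * fpf


# Hypothetical 5-point ratings for negative and positive cases.
neg = [1, 2, 2, 3, 4]
pos = [2, 3, 4, 4, 5]

auc = empirical_auc(neg, pos)
eu = expected_utility(neg, pos, threshold=3, beta=1.0)
print(auc, eu)
```

In a power study of the kind described, endpoint values like these would be recomputed over many simulated reader and case samples under two conditions, and power would be estimated as the fraction of runs in which a test on the endpoint difference rejects the null hypothesis.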

Results: The EU endpoint generally has good statistical power relative to AUC in our simulations. Of 396 total conditions simulated, EU had higher statistical power in 377 (95%). In 246 of these conditions, EU power was at least 5 percentage points higher than AUC power. In simulation runs evaluating the effect of the number of readers and cases on the baseline simulations, the EU measure achieved power equivalent to AUC with 9% to 28% fewer readers or 18% to 41% fewer cases.

Conclusion: These simulation studies provide further motivation for considering EU in studies of screening mammography technology and they motivate investigations of utility in other diagnostic tasks.
