Comparative Study
Biol Psychiatry. 2000 Apr 15;47(8):762-6.
doi: 10.1016/s0006-3223(00)00837-4.

Penny-wise and pound-foolish: the impact of measurement error on sample size requirements in clinical trials

D O Perkins et al. Biol Psychiatry. 2000.

Abstract

Background: Clinical research studies must compensate for measurement error by increasing the number of subjects that are studied, thereby increasing the financial costs of research and exposing greater numbers of subjects to study risks. In this article, we model the relationship between reliability and sample-size requirements and consider the potential tangible cost savings resulting from the decreased number of subjects needed when reliability of raters is improved or multiple ratings are used.

Methods: Standard methods are used to model reliability based on the intraclass correlation coefficient (R) and to perform power calculations. The effect of averaging multiple ratings on reliability, for a given baseline level of single-rating reliability, is modeled with the Spearman-Brown formula.
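
For reference, the Spearman-Brown formula named above is commonly written as shown below. Here R is the reliability of a single rating, as in the abstract; the symbols k (number of ratings averaged) and R_k (reliability of their mean) are introduced for illustration and do not appear in the original text.

```latex
R_k = \frac{k\,R}{1 + (k - 1)\,R}
```

With k = 1 the expression reduces to R, and it increases toward 1 as k grows.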

Results: Our models demonstrate that meaningful reductions in sample size requirements are gained from improvements in reliability. For example, improving reliability from R = 0.7 to R = 0.9 decreases sample size requirements by 22%. Reliability is improved by training and by using the mean of multiple ratings. For example, if the reliability of a single rating is 0.7, the reliability of the mean of two ratings will be approximately 0.8.
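
The two figures quoted in the Results can be checked with a short calculation. The sketch below is not the authors' model: it assumes the usual attenuation argument, in which the observed effect size shrinks by the square root of R, so that the required sample size is proportional to 1/R. The function names are hypothetical and chosen only for this illustration.

```python
def spearman_brown(r_single: float, k: int) -> float:
    """Reliability of the mean of k ratings, given single-rating reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


def relative_sample_size(r_old: float, r_new: float) -> float:
    """Ratio n_new / n_old.

    ASSUMPTION (not stated in the abstract): required sample size is
    proportional to 1 / R, which follows if the observed effect size is
    the true effect size multiplied by sqrt(R).
    """
    return r_old / r_new


if __name__ == "__main__":
    # Mean of two ratings, each with reliability 0.7
    print(round(spearman_brown(0.7, 2), 2))  # 0.82

    # Fractional reduction in sample size when R improves from 0.7 to 0.9
    print(round(1 - relative_sample_size(0.7, 0.9), 2))  # 0.22
```

Under this assumption the calculation reproduces the 22% reduction reported in the abstract, and the Spearman-Brown value of about 0.82 is consistent with the rounded figure of 0.8.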

Conclusions: Improving reliability, either through rater training or by using the mean of multiple ratings, is cost-effective because it reduces the number of subjects needed. Efforts to improve reliability, and thus reduce subject requirements, may also mean that fewer patients bear the burden of research participation and that studies take less time to complete.
