Biostatistics. 2012 Apr;13(2):274-88.
doi: 10.1093/biostatistics/kxr044. Epub 2011 Nov 30.

A two-stage strategy to accommodate general patterns of confounding in the design of observational studies

Sebastien Haneuse et al. Biostatistics. 2012 Apr.

Abstract

Accommodating general patterns of confounding in sample size/power calculations for observational studies is extremely challenging, both technically and scientifically. While employing previously implemented sample size/power tools is appealing, these tools typically ignore important aspects of the design/data structure. In this paper, we show that sample size/power calculations that ignore confounding can be much more unreliable than is conventionally thought; using real data from the US state of North Carolina, naive calculations yield sample size estimates that are half those obtained when confounding is appropriately acknowledged. Unfortunately, eliciting realistic design parameters for confounding mechanisms is difficult. To overcome this, we propose a novel two-stage strategy for observational study design that can accommodate arbitrary patterns of confounding. At the first stage, researchers establish bounds for power that facilitate the decision of whether or not to initiate the study. At the second stage, internal pilot data are used to estimate key scientific inputs that can be used to obtain realistic sample size/power estimates. Our results indicate that the strategy is effective at replicating gold standard calculations based on knowing the true confounding mechanism. Finally, we show that consideration of the nature of confounding is a crucial aspect of the elicitation process; depending on whether the confounder is positively or negatively associated with the exposure of interest and outcome, naive power calculations can either underestimate or overestimate the required sample size. Throughout, simulation is advocated as the only general means to obtain realistic estimates of statistical power; we describe, and provide in an R package, a simple algorithm for estimating power for a case-control study.
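
To make the idea of simulation-based power estimation concrete, the following is a minimal Python sketch for a case-control study with one binary exposure and one binary confounder. It is not the algorithm from the authors' R package: the function names, the default parameter values, and the logistic models for exposure and outcome are all assumptions chosen for illustration, and the adjusted analysis uses Woolf's inverse-variance pooled odds ratio across confounder strata in place of a fitted logistic regression.

```python
import math
import random


def expit(v):
    return 1.0 / (1.0 + math.exp(-v))


def cell_probs(theta_x, b0, bz, a0, a1, pz):
    """Exact P(X=x, Z=z | Y=y) for binary X and Z under the assumed models:
    logit P(X=1|Z) = a0 + a1*Z and logit P(Y=1|X,Z) = b0 + log(theta_x)*X + bz*Z."""
    joint = {0: {}, 1: {}}
    for z in (0, 1):
        p_z = pz if z else 1.0 - pz
        px1 = expit(a0 + a1 * z)
        for x in (0, 1):
            p_x = px1 if x else 1.0 - px1
            p_y1 = expit(b0 + math.log(theta_x) * x + bz * z)
            joint[1][(x, z)] = p_z * p_x * p_y1
            joint[0][(x, z)] = p_z * p_x * (1.0 - p_y1)
    for y in (0, 1):  # normalise within cases (y=1) and controls (y=0)
        total = sum(joint[y].values())
        joint[y] = {cell: p / total for cell, p in joint[y].items()}
    return joint


def draw_counts(rng, probs, n):
    """Sample n subjects into the four (x, z) cells."""
    cells = list(probs)
    counts = {cell: 0 for cell in cells}
    for cell in rng.choices(cells, weights=[probs[c] for c in cells], k=n):
        counts[cell] += 1
    return counts


def woolf_z(cases, controls):
    """Wald z-statistic for the inverse-variance pooled log odds ratio over Z strata."""
    num = den = 0.0
    for z in (0, 1):
        # 0.5 is added to every cell so that empty cells do not break the logs
        a = cases[(1, z)] + 0.5      # exposed cases
        b = cases[(0, z)] + 0.5      # unexposed cases
        c = controls[(1, z)] + 0.5   # exposed controls
        d = controls[(0, z)] + 0.5   # unexposed controls
        weight = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)
        num += weight * math.log(a * d / (b * c))
        den += weight
    return num / math.sqrt(den)


def estimate_power(n1, n0, theta_x, bz=0.0, a1=0.0, b0=-3.0, a0=-1.0,
                   pz=0.3, reps=500, seed=1):
    """Fraction of simulated studies (n1 cases, n0 controls) that reject
    H0: odds ratio = 1 at the two-sided 5% level."""
    rng = random.Random(seed)
    probs = cell_probs(theta_x, b0, bz, a0, a1, pz)
    rejections = 0
    for _ in range(reps):
        cases = draw_counts(rng, probs[1], n1)
        controls = draw_counts(rng, probs[0], n0)
        if abs(woolf_z(cases, controls)) > 1.959964:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for n in (250, 500, 1000, 2000):
        print(n, estimate_power(n, n, theta_x=1.3, bz=1.0, a1=1.0))
```

Setting `bz` and `a1` to zero gives a scenario with no confounding, so running the function once with and once without those terms offers a rough way to see how a calculation that ignores the confounder can differ from one that acknowledges it. The number of replicates (`reps`) controls the Monte Carlo error of the estimate.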

Figures

Fig. 1.
Estimated power curves for detecting θx = 1.3 under a balanced case–control study, as a function of the case–control sample size n = n0 + n1. Each curve corresponds to a model that forms the basis for the power calculation (Sections 2.2 and 3.1). Estimates were obtained using the algorithm in the supplementary material (available at Biostatistics online) with R = 10000.
Fig. 2.
Estimated bounds for power to detect θx = 1.3, based on a case–control design, as a function of case–control sample size n for various scenarios for confounding. Estimates were obtained using the algorithm in the supplementary material (available at Biostatistics online) with R = 10000.
Fig. 3.
Results from four independent realizations of stage II, with pilot data sample sizes of m = 250, m = 500, and m = 1000. Each subfigure shows power curves based on complete data (CD) for the unadjusted and fully adjusted models. Estimates were obtained using the algorithm in the supplementary material (available at Biostatistics online) with R = 10000.
