A two-stage strategy to accommodate general patterns of confounding in the design of observational studies
- PMID: 22130627
- PMCID: PMC3297823
- DOI: 10.1093/biostatistics/kxr044
Abstract
Accommodating general patterns of confounding in sample size/power calculations for observational studies is extremely challenging, both technically and scientifically. While employing previously implemented sample size/power tools is appealing, they typically ignore important aspects of the design/data structure. In this paper, we show that sample size/power calculations that ignore confounding can be far less reliable than is conventionally thought; using real data from the US state of North Carolina, naive calculations yield sample size estimates that are half those obtained when confounding is appropriately acknowledged. Unfortunately, eliciting realistic design parameters for confounding mechanisms is difficult. To overcome this, we propose a novel two-stage strategy for observational study design that can accommodate arbitrary patterns of confounding. At the first stage, researchers establish bounds for power that facilitate the decision of whether or not to initiate the study. At the second stage, internal pilot data are used to estimate key scientific inputs that can be used to obtain realistic sample size/power estimates. Our results indicate that the strategy is effective at replicating gold standard calculations based on knowing the true confounding mechanism. Finally, we show that consideration of the nature of confounding is a crucial aspect of the elicitation process; depending on whether the confounder is positively or negatively associated with the exposure of interest and outcome, naive power calculations can either under- or overestimate the required sample size. Throughout, simulation is advocated as the only general means to obtain realistic estimates of statistical power; we describe, and provide in an R package, a simple algorithm for estimating power for a case-control study.
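The simulation-based approach to power mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm or their R package; it is one plausible reading of "simulate, fit, count rejections" for an unmatched case-control study with a single binary confounder. The function names (`fit_logistic`, `simulate_power`), the intercepts (-1.5 and -3.0), the confounder prevalence (0.3), and all parameter names are assumptions chosen for illustration.

```python
import numpy as np


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-v))


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; returns (coefficients, standard errors)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        w = p * (1.0 - p)
        info = X.T @ (X * w[:, None])  # observed information matrix
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    p = expit(X @ beta)
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, np.sqrt(np.diag(cov))


def simulate_power(n_cases, n_controls, beta_x, beta_z, gamma,
                   adjust=True, n_sims=200, pop_size=50_000, seed=0):
    """Estimate power to detect the exposure log odds ratio beta_x.

    Hypothetical data-generating model (all values are illustrative):
      Z ~ Bernoulli(0.3)                              binary confounder
      logit P(X=1 | Z)    = -1.5 + gamma * Z          exposure model
      logit P(Y=1 | X, Z) = -3.0 + beta_x * X + beta_z * Z
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        # Generate a source population, then sample cases and controls from it.
        z = rng.binomial(1, 0.3, pop_size)
        x = rng.binomial(1, expit(-1.5 + gamma * z))
        y = rng.binomial(1, expit(-3.0 + beta_x * x + beta_z * z))
        cases = rng.choice(np.flatnonzero(y == 1), n_cases, replace=False)
        controls = rng.choice(np.flatnonzero(y == 0), n_controls, replace=False)
        idx = np.concatenate([cases, controls])
        # Design matrix: intercept, exposure, and (optionally) the confounder.
        cols = [np.ones(idx.size), x[idx]]
        if adjust:
            cols.append(z[idx])
        beta_hat, se = fit_logistic(np.column_stack(cols), y[idx].astype(float))
        # Two-sided Wald test of the exposure coefficient at the 5% level.
        if abs(beta_hat[1] / se[1]) > 1.959964:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    adjusted = simulate_power(200, 200, beta_x=np.log(1.5), beta_z=1.0, gamma=1.0)
    naive = simulate_power(200, 200, beta_x=np.log(1.5), beta_z=1.0, gamma=1.0,
                           adjust=False)
    print(f"adjusted analysis: {adjusted:.2f}, unadjusted analysis: {naive:.2f}")
```

Setting `adjust=False` drops the confounder from the fitted model, which gives a rough sense of how a calculation that ignores confounding can diverge from one that acknowledges it. Changing the signs of `gamma` and `beta_z` changes the direction of that divergence, in line with the abstract's point about positive versus negative confounder associations.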