Optimal Designs of Two-Phase Studies
- PMID: 33716361
- PMCID: PMC7954143
- DOI: 10.1080/01621459.2019.1671200
Abstract
The two-phase design is a cost-effective sampling strategy to evaluate the effects of covariates on an outcome when certain covariates are too expensive to be measured on all study subjects. Under such a design, the outcome and inexpensive covariates are measured on all subjects in the first phase and the first-phase information is used to select subjects for measurements of expensive covariates in the second phase. Previous research on two-phase studies has focused largely on the inference procedures rather than the design aspects. We investigate the design efficiency of the two-phase study, as measured by the semiparametric efficiency bound for estimating the regression coefficients of expensive covariates. We consider general two-phase studies, where the outcome variable can be continuous, discrete, or censored, and the second-phase sampling can depend on the first-phase data in any manner. We develop optimal or approximately optimal two-phase designs, which can be substantially more efficient than the existing designs. We demonstrate the improvements of the new designs over the existing ones through extensive simulation studies and two large medical studies.
Keywords: Case-cohort design; Case-control study; Generalized linear models; Outcome-dependent sampling; Proportional hazards; Semiparametric efficiency.
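The two-phase sampling scheme described in the abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the optimal design developed in the paper: the linear data-generating model, the parameter values, and the function names `simulate_phase_one`, `phase_one_residuals`, and `select_phase_two` are all hypothetical. It compares simple random sampling with one simple outcome-dependent rule that selects subjects whose outcome, after adjusting for the inexpensive covariate, lies in the extreme tails.

```python
import random


def simulate_phase_one(n, beta=0.5, gamma=0.3, seed=0):
    """Phase 1: outcome y and inexpensive covariate z for all n subjects.

    The expensive covariate x is generated here but should be treated as
    unobserved until a subject is selected in phase 2. The linear model
    and the parameter values are assumptions made for illustration.
    """
    rng = random.Random(seed)
    cohort = []
    for i in range(n):
        z = rng.gauss(0.0, 1.0)
        x = rng.gauss(0.0, 1.0)  # expensive covariate, hidden in phase 1
        y = beta * x + gamma * z + rng.gauss(0.0, 1.0)
        cohort.append({"id": i, "y": y, "z": z, "x": x})
    return cohort


def phase_one_residuals(cohort):
    """Residuals from a least-squares fit of y on z (phase-1 data only)."""
    n = len(cohort)
    z_bar = sum(s["z"] for s in cohort) / n
    y_bar = sum(s["y"] for s in cohort) / n
    szz = sum((s["z"] - z_bar) ** 2 for s in cohort)
    szy = sum((s["z"] - z_bar) * (s["y"] - y_bar) for s in cohort)
    slope = szy / szz
    return [s["y"] - y_bar - slope * (s["z"] - z_bar) for s in cohort]


def select_phase_two(cohort, m, rule="tails", seed=0):
    """Return indices of m subjects chosen for measurement of x.

    rule="random": simple random sampling, ignoring phase-1 data.
    rule="tails":  the m // 2 smallest and m - m // 2 largest residuals,
                   a simple outcome-dependent rule used as an example.
    """
    n = len(cohort)
    if not 0 < m <= n:
        raise ValueError("m must be between 1 and the cohort size")
    if rule == "random":
        return sorted(random.Random(seed).sample(range(n), m))
    if rule == "tails":
        resid = phase_one_residuals(cohort)
        order = sorted(range(n), key=lambda i: resid[i])
        low = m // 2
        high = m - low
        return sorted(order[:low] + order[n - high:])
    raise ValueError("unknown rule: " + repr(rule))


if __name__ == "__main__":
    cohort = simulate_phase_one(1000)
    for rule in ("random", "tails"):
        ids = select_phase_two(cohort, 200, rule=rule)
        # Only now would x be measured, and only for the selected subjects.
        print(rule, "selected", len(ids), "of", len(cohort), "subjects")
```

The selection functions read only `y` and `z`, which mirrors the constraint that second-phase sampling may depend on first-phase data alone. Analyzing the resulting sample requires an inference procedure that accounts for the sampling rule, which this sketch does not attempt.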