Improved Horvitz-Thompson Estimation of Model Parameters from Two-phase Stratified Samples: Applications in Epidemiology
- PMID: 20174455
- PMCID: PMC2822363
- DOI: 10.1007/s12561-009-9001-6
Abstract
The case-cohort study involves two-phase sampling: simple random sampling from an infinite super-population at phase one and stratified random sampling from a finite cohort at phase two. Standard analyses of case-cohort data solve inverse probability weighted (IPW) estimating equations, with weights determined by the known phase two sampling fractions. The variance of parameter estimates in (semi)parametric models, including the Cox model, is the sum of two terms: (i) the model-based variance of the usual estimates that would be calculated if full data were available for the entire cohort; and (ii) the design-based variance from IPW estimation of the unknown cohort total of the efficient influence function (IF) contributions. This second variance component may be reduced by adjusting the sampling weights, either by calibrating them to known cohort totals of auxiliary variables correlated with the IF contributions or by estimating them using these same auxiliary variables. Both adjustment methods are implemented in the R survey package. We derive the limit laws of coefficients estimated using adjusted weights. The asymptotic results suggest practical methods for constructing auxiliary variables; these methods are evaluated by simulation of case-cohort samples from the National Wilms Tumor Study and by log-linear modeling of case-cohort data from the Atherosclerosis Risk in Communities Study. Although not semiparametric efficient, estimators based on adjusted weights may come close to achieving full efficiency within the class of augmented IPW estimators.
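The weight adjustment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the R survey package: it is plain Python, and the simulated cohort, the variable names, and the helper functions `ht_total` and `calibrate_linear` are all assumptions made for illustration. The variable `y` stands in for an influence function contribution observed only at phase two, and `x` for an auxiliary variable known for the whole cohort. The sketch uses linear calibration, one common way to make weighted sample totals match known cohort totals.

```python
import random


def ht_total(values, weights):
    """Horvitz-Thompson estimate of a cohort total: sum of weight * value."""
    return sum(w * v for w, v in zip(weights, values))


def calibrate_linear(x_sample, design_weights, cohort_size, cohort_x_total):
    """Linear calibration (hypothetical helper).

    Returns weights w_i = d_i * (1 + lam1 + lam2 * x_i) chosen so that the
    weighted totals of (1, x) equal the known cohort totals.
    """
    d = design_weights
    # Weighted cross-product matrix of z = (1, x).
    a11 = sum(d)
    a12 = sum(di * xi for di, xi in zip(d, x_sample))
    a22 = sum(di * xi * xi for di, xi in zip(d, x_sample))
    # Gaps between the known cohort totals and their design-weighted estimates.
    r1 = cohort_size - a11
    r2 = cohort_x_total - a12
    # Solve the 2 x 2 linear system for the calibration coefficients.
    det = a11 * a22 - a12 * a12
    lam1 = (a22 * r1 - a12 * r2) / det
    lam2 = (a11 * r2 - a12 * r1) / det
    return [di * (1.0 + lam1 + lam2 * xi) for di, xi in zip(d, x_sample)]


# Simulated cohort (assumption): x is known for everyone, y only at phase two.
rng = random.Random(2009)
N = 2000
cohort = []
for _ in range(N):
    x = rng.gauss(0.0, 1.0)
    y = 2.0 * x + rng.gauss(0.0, 0.5)   # y is correlated with the auxiliary x
    is_case = rng.random() < 0.1
    cohort.append((x, y, is_case))

# Phase two: keep every case, draw a simple random sample of 200 controls.
cases = [u for u in cohort if u[2]]
controls = [u for u in cohort if not u[2]]
sampled_controls = rng.sample(controls, 200)
sample = cases + sampled_controls

# Design weights are the inverse sampling fractions within each stratum.
d = [1.0] * len(cases) + [len(controls) / 200.0] * 200

x_s = [u[0] for u in sample]
y_s = [u[1] for u in sample]
x_total = sum(u[0] for u in cohort)
y_total = sum(u[1] for u in cohort)   # unknown in practice; kept for comparison

w_cal = calibrate_linear(x_s, d, N, x_total)

print("full-cohort total of y:", y_total)
print("design-weighted estimate:", ht_total(y_s, d))
print("calibrated estimate:", ht_total(y_s, w_cal))
```

When `y` is strongly correlated with `x`, as in this simulated setup, the calibrated estimate tends to vary less across repeated samples than the design-weighted one, although in any single sample either estimate may land closer to the full-cohort total.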
Grants and funding
- N01 HC055022/HC/NHLBI NIH HHS/United States
- N01 HC055019/HL/NHLBI NIH HHS/United States
- N01 HC055021/HL/NHLBI NIH HHS/United States
- R01 CA040644/CA/NCI NIH HHS/United States
- N01 HC055015/HC/NHLBI NIH HHS/United States
- N01 HC055016/HC/NHLBI NIH HHS/United States
- N01 HC055020/HL/NHLBI NIH HHS/United States
- N01 HC055019/HC/NHLBI NIH HHS/United States
- R01 CA054498/CA/NCI NIH HHS/United States
- N01 HC055021/HC/NHLBI NIH HHS/United States
- N01 HC055022/HL/NHLBI NIH HHS/United States
- N01 HC055020/HC/NHLBI NIH HHS/United States