Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2009 May 1;1(1):32.
doi: 10.1007/s12561-009-9001-6.

Improved Horvitz-Thompson Estimation of Model Parameters from Two-phase Stratified Samples: Applications in Epidemiology

Affiliations

Improved Horvitz-Thompson Estimation of Model Parameters from Two-phase Stratified Samples: Applications in Epidemiology

Norman E Breslow et al. Stat Biosci. .

Abstract

The case-cohort study involves two-phase sampling: simple random sampling from an infinite super-population at phase one and stratified random sampling from a finite cohort at phase two. Standard analyses of case-cohort data involve solution of inverse probability weighted (IPW) estimating equations, with weights determined by the known phase two sampling fractions. The variance of parameter estimates in (semi)parametric models, including the Cox model, is the sum of two terms: (i) the model based variance of the usual estimates that would be calculated if full data were available for the entire cohort; and (ii) the design based variance from IPW estimation of the unknown cohort total of the efficient influence function (IF) contributions. This second variance component may be reduced by adjusting the sampling weights, either by calibration to known cohort totals of auxiliary variables correlated with the IF contributions or by their estimation using these same auxiliary variables. Both adjustment methods are implemented in the R survey package. We derive the limit laws of coefficients estimated using adjusted weights. The asymptotic results suggest practical methods for construction of auxiliary variables that are evaluated by simulation of case-cohort samples from the National Wilms Tumor Study and by log-linear modeling of case-cohort data from the Atherosclerosis Risk in Communities Study. Although not semiparametric efficient, estimators based on adjusted weights may come close to achieving full efficiency within the class of augmented IPW estimators.

PubMed Disclaimer

Similar articles

Cited by

References

    1. Ballantyne CM, Hoogeveen RC, Bang H, et al. Lipoprotein-associated phospholipase A(2), high-sensitivity C-reactive protein, and risk for incident coronary heart disease in middle-aged men and women in the Atherosclerosis Risk in Communities (ARIC) study. Circulation. 2004;109:837–842. - PubMed
    1. Barlow WE. Robust variance estimation for the case-cohort design. Biometrics. 1994;50:1064–1072. - PubMed
    1. Barlow WE, Ichikawa L, Rosner D, Izumi S. Analysis of case-cohort designs. J Clin Epidemiol. 1999;52:1165–1172. - PubMed
    1. Begun JM, Hall WJ, Huang W-M, Wellner JA. Information and asymptotic efficiency in parametric-nonparametric models. Ann Stat. 1983;11:432–452.
    1. Binder DA. Fitting Cox’s proportional hazards model from survey data. Biometrika. 1992;79:139–147.

LinkOut - more resources