Am J Epidemiol. 2009 Jun 1;169(11):1398-405.
doi: 10.1093/aje/kwp055. Epub 2009 Apr 8.

Using the whole cohort in the analysis of case-cohort data

Norman E Breslow et al. Am J Epidemiol. 2009.

Abstract

Case-cohort data analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 participants. Remaining subjects contribute to stratified sampling weights only. Analysis methods implemented in the freely available R statistical system (http://cran.r-project.org/) make better use of the data through adjustment of the sampling weights via calibration or estimation. By reanalyzing data from an ARIC study of coronary heart disease and simulations based on data from the National Wilms Tumor Study, the authors demonstrate that such adjustment can dramatically improve the precision of hazard ratios estimated for baseline covariates known for all subjects. Adjustment can also improve precision for partially missing covariates, those known for substudy participants only, when their values may be imputed with reasonable accuracy for the remaining cohort members. Links are provided to software, data sets, and tutorials showing in detail the steps needed to carry out the adjusted analyses. Epidemiologists are encouraged to consider use of these methods to enhance the accuracy of results reported from case-cohort analyses.
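The idea of adjusting sampling weights by calibration can be illustrated with a minimal sketch. It is not the authors' R implementation; it assumes a simple linear (regression-type) calibration with an intercept and a single auxiliary variable known for every cohort member. The function name `calibrate_weights` and the toy numbers are hypothetical. The adjusted weights are chosen so that the weighted substudy sample reproduces the known cohort size and the known cohort total of the auxiliary variable.

```python
def calibrate_weights(design_weights, aux, cohort_size, cohort_aux_total):
    """Linearly calibrate sampling weights (hypothetical helper, illustration only).

    design_weights   -- inverse sampling probabilities for substudy subjects
    aux              -- auxiliary variable for the same subjects (known cohort-wide)
    cohort_size      -- number of subjects in the whole cohort
    cohort_aux_total -- sum of the auxiliary variable over the whole cohort
    """
    # Weighted sums computed from the substudy sample alone.
    s0 = sum(design_weights)
    s1 = sum(d * x for d, x in zip(design_weights, aux))
    s2 = sum(d * x * x for d, x in zip(design_weights, aux))

    # Gap between the known cohort totals and the weighted sample estimates.
    r0 = cohort_size - s0
    r1 = cohort_aux_total - s1

    # Solve the 2x2 system [[s0, s1], [s1, s2]] @ [lam0, lam1] = [r0, r1].
    det = s0 * s2 - s1 * s1
    if det == 0:
        raise ValueError("auxiliary variable has no variation in the sample")
    lam0 = (s2 * r0 - s1 * r1) / det
    lam1 = (s0 * r1 - s1 * r0) / det

    # Each calibrated weight is the design weight times a linear adjustment.
    return [d * (1 + lam0 + lam1 * x) for d, x in zip(design_weights, aux)]


# Toy data (made up): a cohort of 10 subjects with auxiliary values 1..10,
# of whom 3 were sampled with equal probability 3/10.
cohort_aux = list(range(1, 11))
sample_aux = [2, 5, 9]
design = [10 / 3] * 3

calibrated = calibrate_weights(design, sample_aux, len(cohort_aux), sum(cohort_aux))
weighted_count = sum(calibrated)
weighted_aux_total = sum(w * x for w, x in zip(calibrated, sample_aux))
```

After calibration, `weighted_count` and `weighted_aux_total` match the cohort size and the cohort total of the auxiliary variable up to rounding error. Estimates that use the calibrated weights can be more precise when the auxiliary variable is correlated with the quantities being estimated, which is the kind of gain the abstract describes for covariates known or imputable for all subjects.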

Figures

Figure 1.
Scatter plot and nonparametric regression curve showing predicted values of lipoprotein-phospholipase A2 (μg/L) plotted against measured values. Predicted values are based on weighted linear regression from phase 2 data (the Atherosclerosis Risk in Communities case-cohort study).

