BMC Med Res Methodol. 2011 Jun 26:11:99.
doi: 10.1186/1471-2288-11-99.

Stratified sampling design and loss to follow-up in survival models: evaluation of efficiency and bias

Cibele C César et al. BMC Med Res Methodol.

Abstract

Background: Longitudinal studies often employ complex sample designs to optimize sample size, over-representing population groups of interest. The effect of sample design on parameter estimates is quite often ignored, particularly when fitting survival models. Another major problem in long-term cohort studies is the potential bias due to loss to follow-up.

Methods: In this paper we simulated a dataset with approximately 50,000 individuals as the target population and 15,000 participants to be followed up for 40 years, both based on real cohort studies of cardiovascular diseases. We analyzed two sampling strategies (simple random sampling, our gold standard, and sampling stratified by professional group with non-proportional allocation) and two loss to follow-up scenarios (non-informative censoring and losses related to the professional group).
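The sampling design described in the Methods can be illustrated with a minimal sketch. It is not the authors' simulation code: the three professional groups, their sizes, the smoking prevalences, and the Weibull scale and shape values are all hypothetical, chosen only so that the population has 50,000 individuals and the stratified sample 15,000. The sketch draws the same number of participants from strata of different sizes (non-proportional allocation) and attaches to each participant a design weight equal to stratum size divided by stratum sample size.

```python
import random

rng = random.Random(42)

# Hypothetical strata: sizes and smoking prevalences are assumptions.
GROUP_SIZE = {"A": 30_000, "B": 12_500, "C": 7_500}   # population of 50,000
SAMPLE_SIZE = {"A": 5_000, "B": 5_000, "C": 5_000}    # sample of 15,000
SMOKING_PREV = {"A": 0.40, "B": 0.25, "C": 0.10}
FOLLOW_UP_YEARS = 40.0


def simulate_person(group):
    """One individual with a Weibull event time censored at end of follow-up."""
    smoker = rng.random() < SMOKING_PREV[group]
    scale = 45.0 if smoker else 70.0          # assumed scale parameters (years)
    t = rng.weibullvariate(scale, 1.5)        # assumed shape parameter 1.5
    return {
        "group": group,
        "smoker": smoker,
        "time": min(t, FOLLOW_UP_YEARS),
        "event": t <= FOLLOW_UP_YEARS,
    }


population = [simulate_person(g) for g, n in GROUP_SIZE.items() for _ in range(n)]

# Stratified sample with non-proportional allocation and design weights.
sample = []
for g in GROUP_SIZE:
    members = [p for p in population if p["group"] == g]
    weight = len(members) / SAMPLE_SIZE[g]    # inverse of the sampling fraction
    for p in rng.sample(members, SAMPLE_SIZE[g]):
        sample.append({**p, "weight": weight})

pop_prev = sum(p["smoker"] for p in population) / len(population)
unweighted_prev = sum(p["smoker"] for p in sample) / len(sample)
weighted_prev = (sum(p["weight"] * p["smoker"] for p in sample)
                 / sum(p["weight"] for p in sample))

print(f"population smoking prevalence: {pop_prev:.3f}")
print(f"unweighted sample estimate:    {unweighted_prev:.3f}")
print(f"weighted sample estimate:      {weighted_prev:.3f}")
```

Because the smaller strata are over-represented, the unweighted sample prevalence drifts away from the population value, while the weighted estimate should stay close to it. The weights also sum to the population size, which is a quick check that they were built correctly.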

Results: Two modeling approaches were evaluated: weighted and non-weighted fit. Our results indicate that under the correctly specified model, ignoring the sample weights does not affect the results. However, the model that ignored the interaction between sample strata and the variable of interest, as well as the crude estimates, were highly biased.
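The following sketch illustrates why a crude estimate can be biased when sample weights are ignored. It is a simplified stand-in, not the survival models fitted in the paper: event times are exponential, the hazard ratio is estimated as a ratio of events per person-time, and every numerical value (baseline hazard, group sizes, the two group-specific smoking hazard ratios) is an assumption. The effect of smoking is made to differ between two hypothetical professional groups, so the crude hazard ratio depends on how the groups are mixed in the data.

```python
import random

rng = random.Random(2011)

BASE_RATE = 0.01                              # assumed baseline hazard per year
SMOKING_HR = {"g1": 1.5, "g2": 4.0}           # assumed group-specific effects (interaction)
GROUP_SIZE = {"g1": 40_000, "g2": 10_000}     # hypothetical population strata
SAMPLE_SIZE = {"g1": 5_000, "g2": 5_000}      # non-proportional allocation
FOLLOW_UP = 40.0                              # years of follow-up


def make_person(group):
    """One individual with an exponential event time censored at FOLLOW_UP."""
    smoker = rng.random() < 0.3               # assumed smoking prevalence
    rate = BASE_RATE * (SMOKING_HR[group] if smoker else 1.0)
    t = rng.expovariate(rate)
    return {"group": group, "smoker": smoker, "time": min(t, FOLLOW_UP),
            "event": t <= FOLLOW_UP, "weight": 1.0}


def crude_hazard_ratio(people, use_weights):
    """Ratio of (weighted) events per person-time, smokers vs non-smokers."""
    events = {True: 0.0, False: 0.0}
    person_time = {True: 0.0, False: 0.0}
    for p in people:
        w = p["weight"] if use_weights else 1.0
        events[p["smoker"]] += w * p["event"]
        person_time[p["smoker"]] += w * p["time"]
    return (events[True] / person_time[True]) / (events[False] / person_time[False])


population = {g: [make_person(g) for _ in range(n)] for g, n in GROUP_SIZE.items()}

sample = []
for g, members in population.items():
    weight = GROUP_SIZE[g] / SAMPLE_SIZE[g]   # design weight for stratum g
    sample.extend({**p, "weight": weight} for p in rng.sample(members, SAMPLE_SIZE[g]))

everyone = [p for members in population.values() for p in members]
pop_hr = crude_hazard_ratio(everyone, use_weights=False)
unweighted_hr = crude_hazard_ratio(sample, use_weights=False)
weighted_hr = crude_hazard_ratio(sample, use_weights=True)

print(f"population crude HR:  {pop_hr:.2f}")
print(f"unweighted sample HR: {unweighted_hr:.2f}")
print(f"weighted sample HR:   {weighted_hr:.2f}")
```

Under these assumptions the unweighted sample gives too much influence to the over-sampled group with the larger smoking effect, so its crude hazard ratio should land further from the population value than the weighted one does.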

Conclusions: In epidemiological studies misspecification should always be considered, as different sources of variability, related to the individuals and not captured by the covariates, are always present. Therefore, allowance must be made for the possibility of unknown confounders and interactions with the main variable of interest in our data. We strongly recommend always correcting for the sample design by using sample weights.


Figures

Figure 1. Weibull distribution. Effect of changing the scale parameter on the time-to-event curve based on the simulated scenario.
Figure 2. Simulated Hazard Ratios under the Full Model. Correctly specified model returns exactly the same results independently of considering sample weights.
Figure 3. Simulated Hazard Ratios under Marginal Model. Large difference is observed for the hazards associated with smoking when fitting without sample weights, if the model does not include the interaction with professional category.
Figure 4. Simulated Hazard Ratios under Smoke-only Model. The pattern is similar to the Marginal model, with similar bias.
Figure 5. Simulated Hazard Ratios with loss to follow-up under Full Model. The upper frames show the random loss to follow-up and the lower ones the non-random censoring.
Figure 6. Simulated Hazard Ratios with loss to follow-up under Marginal Model. The upper frames, with the random loss to follow-up, show the bias for the smoking-hazard ratio for the non-weighted model. The lower frames with non-random censoring show the bias for all models.
Figure 7. Simulated Hazard Ratios with loss to follow-up under Smoke-only Model. The upper frame shows the bias for the non-weighted smoke-only model and the lower one the bias for all approaches due to non-random loss.
Figure 8. Average Variance of Estimates according to two Scenarios: without loss and with non-random loss to follow-up. The upper frame, without loss, shows smaller variance than the lower one and a similar pattern.
Figure 9. Mean Square Error according to two Scenarios: without loss and with non-random loss to follow-up. Both simulations, without loss (upper frame) and with loss (lower one), display a similar pattern, with the non-weighted model performing much worse.


