Correction of Selection Bias in Survey Data: Is the Statistical Cure Worse Than the Bias?
- PMID: 28272961
- PMCID: PMC5343703
- DOI: 10.2105/AJPH.2016.303644
Abstract
In previous articles in the American Journal of Epidemiology (Am J Epidemiol. 2013;177(5):431-442) and American Journal of Public Health (Am J Public Health. 2013;103(10):1895-1901), Masters et al. reported age-specific hazard ratios for the contrasts in mortality rates between obesity categories. They corrected the observed hazard ratios for selection bias caused by what they postulated was the nonrepresentativeness of the participants in the National Health Interview Survey that increased with age, obesity, and ill health. However, it is possible that their regression approach to remove the alleged bias has not produced, and in general cannot produce, sensible hazard ratio estimates. First, one must consider how many nonparticipants there might have been in each category of obesity and of age at entry and how much higher the mortality rates would have to be in nonparticipants than in participants in these same categories. What plausible set of numerical values would convert the ("biased") decreasing-with-age hazard ratios seen in the data into the ("unbiased") increasing-with-age ratios that they computed? Can these values be encapsulated in (and can sensible values be recovered from) one additional internal variable in a regression model? Second, one must examine the age pattern of the hazard ratios that have been adjusted for selection. Without the correction, the hazard ratios are attenuated with increasing age. With it, the hazard ratios at older ages are considerably higher, but those at younger ages are well below one. Third, one must test whether the regression approach suggested by Masters et al. would correct the nonrepresentativeness that increased with age and ill health that I introduced into real and hypothetical data sets. I found that the approach did not recover the hazard ratio patterns present in the unselected data sets: the corrections overshot the target at older ages and undershot it at younger ages.
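The first question above, how many nonparticipants there would have to be and how much higher their mortality would have to be, can be explored with simple arithmetic. The following is a minimal sketch, not the method used in the article or by Masters et al.: it treats a ratio of constant mortality rates as a stand-in for a hazard ratio, and the function names, participation fractions, and mortality multipliers are all hypothetical values chosen only for illustration.

```python
def population_rate(rate_participants, frac_participating, nonparticipant_multiplier):
    """Mortality rate in a whole category, mixing participants and nonparticipants.

    Assumption: nonparticipants die at `nonparticipant_multiplier` times the
    rate seen in participants of the same category.
    """
    rate_nonparticipants = rate_participants * nonparticipant_multiplier
    return (frac_participating * rate_participants
            + (1.0 - frac_participating) * rate_nonparticipants)


def implied_true_ratio(observed_ratio, frac_obese, mult_obese, frac_ref, mult_ref):
    """Rate ratio (obese vs. reference) in the full population implied by the
    ratio observed among participants, given hypothetical selection inputs."""
    # Scale so that reference-category participants have rate 1.0.
    obese = population_rate(observed_ratio, frac_obese, mult_obese)
    reference = population_rate(1.0, frac_ref, mult_ref)
    return obese / reference


# Hypothetical older age group: observed ratio 1.2; 70% of obese and 90% of
# reference persons participate; nonparticipants die at 3x and 1.5x the
# participants' rates, respectively.
ratio = implied_true_ratio(1.2, frac_obese=0.7, mult_obese=3.0,
                           frac_ref=0.9, mult_ref=1.5)
print(round(ratio, 2))
```

Varying the inputs by age group shows how extreme the participation fractions and mortality multipliers would need to be to turn a decreasing-with-age pattern into an increasing one.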
Comment in
- Masters et al. Respond. Am J Public Health. 2017 Apr;107(4):505-506. doi: 10.2105/AJPH.2017.303715. PMID: 28272945 Free PMC article. No abstract available.
- Editorial: Note About Inaccurate Results Published in the American Journal of Epidemiology and the American Journal of Public Health. Am J Public Health. 2017 Apr;107(4):502. doi: 10.2105/AJPH.2016.303643. PMID: 28272963 Free PMC article. No abstract available.
- Editorial: Note About Inaccurate Results Published in the American Journal of Epidemiology and the American Journal of Public Health. Am J Epidemiol. 2017 Mar 15;185(6):407-408. doi: 10.1093/aje/kww176. PMID: 28399573 No abstract available.
- Masters et al. Respond. Am J Epidemiol. 2017 Mar 15;185(6):412-413. doi: 10.1093/aje/kwx011. PMID: 28399574 Free PMC article. No abstract available.
Comment on
- Obesity and US mortality risk over the adult life course. Am J Epidemiol. 2013 Mar 1;177(5):431-42. doi: 10.1093/aje/kws325. Epub 2013 Feb 3. PMID: 23380043 Free PMC article.
- The impact of obesity on US mortality levels: the importance of age and cohort factors in population estimates. Am J Public Health. 2013 Oct;103(10):1895-901. doi: 10.2105/AJPH.2013.301379. Epub 2013 Aug 15. PMID: 23948004 Free PMC article.