Int J Epidemiol. 2018 Feb 1;47(1):226-235.
doi: 10.1093/ije/dyx206.

Collider scope: when selection bias can substantially influence observed associations

Marcus R Munafò et al. Int J Epidemiol. 2018.

Abstract

Large-scale cross-sectional and cohort studies have transformed our understanding of the genetic and environmental determinants of health outcomes. However, the representativeness of these samples may be limited, either through selection into studies or by attrition from studies over time. Here we explore the potential impact of this selection bias on results obtained from these studies, from the perspective that this amounts to conditioning on a collider (i.e. a form of collider bias). Whereas it is acknowledged that selection bias will have a strong effect on representativeness and prevalence estimates, it is often assumed that it should not have a strong impact on estimates of associations. We argue that because selection can induce collider bias (which occurs when two variables independently influence a third variable, and that third variable is conditioned upon), selection can lead to substantially biased estimates of associations. In particular, selection related to phenotypes can bias associations with genetic variants associated with those phenotypes. In simulations, we show that even modest influences on selection into, or attrition from, a study can generate biased and potentially misleading estimates of both phenotypic and genotypic associations. Our results highlight the value of knowing which population your study sample is representative of. If the factors influencing selection and attrition are known, they can be adjusted for. For example, having DNA available on most participants in a birth cohort study offers the possibility of investigating the extent to which polygenic scores predict subsequent participation, which in turn would enable sensitivity analyses of the extent to which bias might distort estimates.

Figures

Figure 1
Illustration of collider bias. The basic premise of collider bias is shown. In this example, a bell is sounded whenever either coin comes up ‘heads’. The result of one coin toss is independent of the other. However, if we hear the bell ring (i.e. we condition on the bell ringing), then if we see a tail on one coin we know there must be a head on the other: the two coin results are no longer independent and a spurious inverse correlation has been induced. Reproduced from Gage SH, Davey Smith G, Ware JJ, Flint J, Munafò MR. G = E: What GWAS can tell us about the environment. PLoS Genet 2016;12:e1005765.
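
The coin-and-bell example can be checked numerically. The following is a minimal sketch, not taken from the article: the function names `pearson` and `simulate_coins`, the number of tosses and the random seed are all illustrative assumptions. It tosses two fair coins independently, then compares the correlation between the coins across all tosses with the correlation among only those tosses where the bell rang.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_coins(n_tosses=100_000, seed=0):
    """Toss two independent fair coins (1 = heads, 0 = tails).

    Returns the correlation between the coins over all tosses and over
    the subset of tosses where the bell rang (at least one head).
    n_tosses and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    tosses = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_tosses)]
    # Conditioning on the collider: keep only tosses where the bell rang.
    rung = [(a, b) for a, b in tosses if a == 1 or b == 1]
    corr_all = pearson(*zip(*tosses))
    corr_rung = pearson(*zip(*rung))
    return corr_all, corr_rung


if __name__ == "__main__":
    corr_all, corr_rung = simulate_coins()
    print(f"all tosses:       r = {corr_all:+.3f}")
    print(f"bell rang only:   r = {corr_rung:+.3f}")
```

Across all tosses the correlation should be close to zero. Among bell-ringing tosses the three outcomes (head, head), (head, tail) and (tail, head) are equally likely, which gives a theoretical correlation of -0.5.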
Figure 2
Illustration of selection bias simulation. In the intended study population there is no association between allele score and outcome. Selection into the study (either through voluntary participation at baseline, or attrition over time) induces an association between allele score and outcome (collider bias).
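
The mechanism in Figure 2 can be sketched in a few lines. This is an illustrative toy simulation under stated assumptions, not the authors' simulation code: the logistic selection model, the coefficients `b_score` and `b_outcome`, the sample size and the seed are all hypothetical choices made for the example. Allele score and outcome are drawn independently, and each individual then enters the study with a probability that depends on both.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_selection(n=100_000, b_score=1.0, b_outcome=1.0, seed=0):
    """Simulate selection on two independent standard-normal variables.

    Assumption (not from the article): the probability of selection is
    logistic in a linear combination of allele score and outcome.
    Returns (correlation in full population, correlation in selected sample).
    """
    rng = random.Random(seed)
    population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    selected = []
    for score, outcome in population:
        p_select = 1.0 / (1.0 + math.exp(-(b_score * score + b_outcome * outcome)))
        if rng.random() < p_select:
            selected.append((score, outcome))
    corr_full = pearson(*zip(*population))
    corr_selected = pearson(*zip(*selected))
    return corr_full, corr_selected


if __name__ == "__main__":
    corr_full, corr_selected = simulate_selection()
    print(f"intended population: r = {corr_full:+.3f}")
    print(f"selected sample:     r = {corr_selected:+.3f}")
```

In the full population the correlation should be near zero by construction. In the selected sample, with both coefficients positive, a negative correlation appears even though neither variable influences the other. Setting either coefficient to zero should remove the induced association, since selection then no longer acts as a collider between the two variables.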
Figure 3
Scenarios where selection bias would occur. A. In truth, the SNP is not causally associated with the outcome; selection will induce an association (which could be positive or negative). B. In truth, the SNP is not causally associated with the outcome; selection will induce an association (which could be positive or negative). C. In truth, the SNP is causally associated with the outcome; selection could make this larger or attenuate it. D. In truth, the SNP is causally associated with the outcome; selection could make this larger or attenuate it. E. In truth, the SNP is causally associated with the outcome; selection will bias this association (which could be positive or negative). F. Note that the association between P and O is biased in the selected sample; however, the association between SNP and O is unbiased in the selected sample. P, Phenotype; O, Outcome; S, Selection; U, Other variables.
