Epidemiology. 2012 Jan;23(1):159-64.
doi: 10.1097/EDE.0b013e31823b6296.

Berkson's bias, selection bias, and missing data

Daniel Westreich. Epidemiology. 2012 Jan.

Abstract

Although Berkson's bias is widely recognized in the epidemiologic literature, it remains underappreciated as a model of both selection bias and bias due to missing data. Simple causal diagrams and 2 × 2 tables illustrate how Berkson's bias connects to collider bias and selection bias more generally, and show the strong analogies between Berksonian selection bias and bias due to missing data. In some situations, considerations of whether data are missing at random or missing not at random are less important than the causal structure of the missing data process. Although dealing with missing data always relies on strong assumptions about unobserved variables, the intuitions built with simple examples can provide a better understanding of approaches to missing data in real-world situations.
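The analogy between selection and missing data can be made concrete with a small simulation. The sketch below is illustrative only: the prevalences, the probabilities that D is recorded, and the function names are hypothetical and do not come from the article. E and D are generated independently, so the odds ratio in the full data is about 1. Whether D is recorded then depends on both E and D, which matches the structure in which both variables affect C, and a complete-case analysis amounts to restricting to C=1.

```python
import random


def odds_ratio(records):
    """Odds ratio for D by E from (e, d) pairs in which d is observed (0 or 1)."""
    n = {(e, d): 0 for e in (0, 1) for d in (0, 1)}
    for e, d in records:
        n[(e, d)] += 1
    return (n[(1, 1)] * n[(0, 0)]) / (n[(1, 0)] * n[(0, 1)])


def simulate(n=200_000, seed=0):
    """Return (odds ratio in full data, odds ratio among complete cases)."""
    rng = random.Random(seed)
    # Hypothetical probability that D is recorded, by (E, D).
    # Exposed cases are the least likely to have D recorded.
    p_observed = {(0, 0): 0.9, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.3}

    full, observed = [], []
    for _ in range(n):
        e = int(rng.random() < 0.5)  # assumed exposure prevalence
        d = int(rng.random() < 0.2)  # assumed risk, independent of E
        full.append((e, d))
        d_recorded = d if rng.random() < p_observed[(e, d)] else None
        observed.append((e, d_recorded))

    # Complete-case analysis: drop every record with missing D.
    complete_cases = [(e, d) for e, d in observed if d is not None]
    return odds_ratio(full), odds_ratio(complete_cases)


if __name__ == "__main__":
    full_or, cc_or = simulate()
    print(f"full data OR: {full_or:.2f}")
    print(f"complete-case OR: {cc_or:.2f}")
```

With these assumed probabilities, the complete-case odds ratio is expected to fall near (0.3 × 0.9) / (0.9 × 0.9) = 1/3 even though E has no effect on D, because the records that remain are a selected subset defined by a common effect of E and D.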

Conflict of interest statement

No conflicts of interest.

Figures

Figure 1. An illustration of Berkson's bias
Figure 1A (left) shows a causal structure with an exposure E, an outcome D, and a factor C (clinic attendance) affected by both E and D. In Figure 1B, restricting to one level of C (C=1) induces a non-causal association between E and D, represented by a dotted line.

Figure 2. Causal diagram for non-informative selection bias
Neither E nor D affects factor C, so conditioning on or restricting to a level of C amounts to simple random sampling.

Figure 3. Causal diagram for informative selection bias
E, but not D, affects factor C, so conditioning on or restricting to a level of C amounts to simple random sampling within levels of E.

Figure 4. Causal diagram for informative selection bias
D, but not E, affects factor C, so conditioning on or restricting to a level of C amounts to simple random sampling within levels of D.

Figure 5. Causal diagram for informative selection bias
E and D affect factor C, so conditioning on or restricting to a level of C amounts to simple random sampling within joint levels of E and D.
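
The sampling descriptions in the captions for Figures 2 through 5 can be checked with expected counts in a 2 × 2 table. The following is a minimal sketch under stated assumptions: the source-population counts, the selection probabilities P(C=1 | E, D), and all names are hypothetical and chosen only for illustration. Each structure is represented by a function giving the probability of selection for a cell, and the odds ratio is recomputed among those selected.

```python
def odds_ratio(table):
    """Odds ratio from a dict keyed by (e, d) with cell counts as values."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


def restrict_to_selected(table, p_select):
    """Expected counts among those with C=1, given P(C=1 | E=e, D=d)."""
    return {cell: n * p_select(*cell) for cell, n in table.items()}


# Hypothetical source population, keyed by (E, D).
SOURCE = {(0, 0): 800, (0, 1): 200, (1, 0): 600, (1, 1): 400}

# Hypothetical selection probabilities for both E and D affecting C.
JOINT = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9}

# One assumed selection mechanism per causal structure.
STRUCTURES = {
    "Figure 2: neither E nor D affects C": lambda e, d: 0.5,
    "Figure 3: only E affects C": lambda e, d: 0.8 if e else 0.4,
    "Figure 4: only D affects C": lambda e, d: 0.8 if d else 0.4,
    "Figure 5: E and D affect C": lambda e, d: JOINT[(e, d)],
}


def compare_structures():
    """Odds ratio in the source population and under each selection mechanism."""
    results = {"source population": odds_ratio(SOURCE)}
    for label, p_select in STRUCTURES.items():
        results[label] = odds_ratio(restrict_to_selected(SOURCE, p_select))
    return results


if __name__ == "__main__":
    for label, value in compare_structures().items():
        print(f"{label}: OR = {value:.2f}")
```

In general, the odds ratio among the selected equals the source odds ratio multiplied by (s11 × s00) / (s10 × s01), where s_ed is the selection probability for cell (E=e, D=d). That factor is 1 whenever selection depends on at most one of E and D, and with the assumed values for Figure 5 it is 0.36. This sketch tracks only the odds ratio; other measures, such as the risk ratio, can behave differently under the same selection mechanisms.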

References

    1. Berkson J. Limitations of the Application of Fourfold Table Analysis to Hospital Data. Biometrics Bulletin. 1946;2(3):47–53.
    2. Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14(3):300–306.
    3. Hernán MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15(5):615–625.
    4. Rubin DB. Inference and Missing Data. Biometrika. 1976;63:581–592.
    5. Little RJA, Rubin DB. Statistical Analysis with Missing Data. New York: John Wiley; 1987.
