Stat Med. 2020 Dec 30;39(30):4922-4948.
doi: 10.1002/sim.8741. Epub 2020 Sep 23.

Formulating causal questions and principled statistical answers

Els Goetghebeur et al. Stat Med. 2020.

Abstract

Although review papers on causal inference methods are now available, there is a lack of introductory overviews on what they can render and on the guiding criteria for choosing one particular method. This tutorial gives an overview in situations where an exposure of interest is set at a chosen baseline ("point exposure") and the target outcome arises at a later time point. We first phrase relevant causal questions and make a case for being specific about the possible exposure levels involved and the populations for which the question is relevant. Using the potential outcomes framework, we describe principled definitions of causal effects and of estimation approaches classified according to whether they invoke the no unmeasured confounding assumption (including outcome regression and propensity score-based methods) or an instrumental variable with added assumptions. We mainly focus on continuous outcomes and causal average treatment effects. We discuss interpretation, challenges, and potential pitfalls and illustrate application using a "simulation learner" that mimics the effect of various breastfeeding interventions on a child's later development. This involves a typical simulation component with generated exposure, covariate, and outcome data inspired by a randomized intervention study. The simulation learner further generates various (linked) exposure types with a set of possible values per observation unit, from which observed as well as potential outcome data are generated. It thus provides true values of several causal effects. R code for data generation and analysis is available on www.ofcaus.org, where SAS and Stata code for analysis is also provided.

Keywords: causation; instrumental variable; inverse probability weighting; matching; potential outcomes; propensity score.
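
To make the abstract's central idea concrete, the following is a minimal sketch, in Python rather than the R, SAS, or Stata code the authors provide, of a toy simulation in which both potential outcomes are generated for every unit so that the true average treatment effect is known. It is not the paper's simulation learner: the variable names (L for a binary confounder, A for a point exposure, Y0 and Y1 for potential outcomes) and all numeric values are assumptions chosen for illustration. Under no unmeasured confounding, it compares a naive difference in means with outcome regression and with inverse probability weighting.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating model (all values are illustrative assumptions).
L = rng.binomial(1, 0.5, size=n)                    # measured binary confounder
A = rng.binomial(1, 0.2 + 0.5 * L)                  # exposure depends on L
Y0 = 10.0 + 3.0 * L + rng.normal(0.0, 1.0, size=n)  # potential outcome if unexposed
Y1 = Y0 + 2.0                                       # potential outcome if exposed
Y = np.where(A == 1, Y1, Y0)                        # only one outcome is observed

# True average treatment effect, known only because both outcomes were simulated.
true_ate = np.mean(Y1 - Y0)

# Naive contrast: confounded, because L affects both A and Y.
naive = Y[A == 1].mean() - Y[A == 0].mean()

# Outcome regression: fit Y ~ 1 + A + L, then average predictions
# with everyone set to A = 1 versus everyone set to A = 0.
X = np.column_stack([np.ones(n), A, L])
beta = np.linalg.lstsq(X, Y, rcond=None)[0]
X1 = np.column_stack([np.ones(n), np.ones(n), L])
X0 = np.column_stack([np.ones(n), np.zeros(n), L])
ate_reg = np.mean(X1 @ beta - X0 @ beta)

# Inverse probability weighting: the propensity score is estimated
# as the exposed fraction within each level of L.
ps = np.where(L == 1, A[L == 1].mean(), A[L == 0].mean())
mu1 = np.sum(A * Y / ps) / np.sum(A / ps)
mu0 = np.sum((1 - A) * Y / (1 - ps)) / np.sum((1 - A) / (1 - ps))
ate_ipw = mu1 - mu0

print(f"true ATE           : {true_ate:.3f}")
print(f"naive difference   : {naive:.3f}")
print(f"outcome regression : {ate_reg:.3f}")
print(f"IPW                : {ate_ipw:.3f}")
```

In this toy setting the two adjusted estimates should land close to the true effect while the naive contrast overstates it, because exposed units are more likely to have L = 1. With a continuous or high-dimensional confounder the propensity score would need a fitted model rather than stratum frequencies.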

Figures

FIGURE 1
Data generating model for the simulation learner. BEP, breastfeeding encouragement program; BF, breastfeeding; m, months
FIGURE 2
DAG representing the setting for an IV analysis. A, treatment; U, unmeasured confounders; Y, outcome; Z, instrument
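
The structure in this DAG can be illustrated with a small simulation. The sketch below is an assumption-laden toy example, not the authors' analysis: it generates a randomized binary instrument Z, an unmeasured binary confounder U, a treatment A, and an outcome Y with a constant treatment effect of 2, then applies the ratio (Wald) estimator cov(Z, Y) / cov(Z, A). The constant-effect assumption is one possible choice of the "added assumptions" an IV analysis needs; all coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical data-generating model following the DAG: Z -> A -> Y, U -> A, U -> Y.
Z = rng.binomial(1, 0.5, size=n)                 # instrument, independent of U
U = rng.binomial(1, 0.5, size=n)                 # unmeasured confounder
A = rng.binomial(1, 0.2 + 0.4 * Z + 0.3 * U)     # treatment depends on Z and U
Y = 5.0 + 2.0 * A + 3.0 * U + rng.normal(0.0, 1.0, size=n)  # Z affects Y only via A

# Naive contrast: biased because U is not available for adjustment.
naive_iv_example = Y[A == 1].mean() - Y[A == 0].mean()

# Wald (ratio) estimator: effect of Z on Y divided by effect of Z on A.
cov_zy = np.cov(Z, Y)[0, 1]
cov_za = np.cov(Z, A)[0, 1]
wald = cov_zy / cov_za

print(f"naive difference : {naive_iv_example:.3f}")
print(f"Wald IV estimate : {wald:.3f}")
```

Because Z is randomized and influences Y only through A, the ratio recovers the constant effect in large samples even though U is never used in the estimation. With heterogeneous effects the same ratio would need further assumptions to be interpreted.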
