Am J Epidemiol. 2018 Dec 1;187(12):2705-2715.
doi: 10.1093/aje/kwy173.

Canonical Causal Diagrams to Guide the Treatment of Missing Data in Epidemiologic Studies

Margarita Moreno-Betancur et al. Am J Epidemiol. 2018.

Erratum in

Abstract

With incomplete data, the "missing at random" (MAR) assumption is widely understood to enable unbiased estimation with appropriate methods. While the need to assess the plausibility of MAR and to perform sensitivity analyses considering "missing not at random" (MNAR) scenarios has been emphasized, the practical difficulty of these tasks is rarely acknowledged. With multivariable missingness, what MAR means is difficult to grasp, and in many MNAR scenarios unbiased estimation is possible using methods commonly associated with MAR. Directed acyclic graphs (DAGs) have been proposed as an alternative framework for specifying practically accessible assumptions beyond the MAR-MNAR dichotomy. However, there is currently no general algorithm for deciding how to handle the missing data given a specific DAG. Here we construct "canonical" DAGs capturing typical missingness mechanisms in epidemiologic studies with incomplete data on exposure, outcome, and confounding factors. For each DAG, we determine whether common target parameters are "recoverable," meaning that they can be expressed as functions of the available data distribution and thus estimated consistently, or whether sensitivity analyses are necessary. We investigate the performance of available-case and multiple-imputation procedures. Using data from waves 1-3 of the Longitudinal Study of Australian Children (2004-2008), we illustrate how our findings can guide the treatment of missing data in point-exposure studies.
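The idea of recoverability can be made concrete with a small simulation. The sketch below is not taken from the paper and does not reproduce any of its canonical m-DAGs. It assumes a hypothetical linear model with a single exposure X and outcome Y (true slope 2.0), no confounders, and missingness only in Y. It compares a complete-case regression when the missingness of Y is driven by X alone with one where it is driven by Y itself. All names (`simulate`, `ols_slope`) and parameter values are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Logistic function, used here to turn a score into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Compare complete-case slopes under two hypothetical missingness mechanisms.

    Assumed data-generating model (illustrative only): Y = 1 + 2*X + noise.
    """
    rng = random.Random(seed)
    full_x, full_y = [], []
    cc_x_driven = ([], [])  # records kept when missingness of Y depends on X
    cc_y_driven = ([], [])  # records kept when missingness of Y depends on Y

    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        full_x.append(x)
        full_y.append(y)

        # Mechanism 1: probability that Y is observed depends on X only.
        if rng.random() < sigmoid(-x):
            cc_x_driven[0].append(x)
            cc_x_driven[1].append(y)

        # Mechanism 2: probability that Y is observed depends on Y itself.
        if rng.random() < sigmoid(-(y - 1.0)):
            cc_y_driven[0].append(x)
            cc_y_driven[1].append(y)

    return {
        "full": ols_slope(full_x, full_y),
        "missing_depends_on_x": ols_slope(*cc_x_driven),
        "missing_depends_on_y": ols_slope(*cc_y_driven),
    }


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name}: {slope:.3f}")
```

Under these assumptions, the complete-case slope should stay close to the true value when missingness depends only on X, and should be attenuated when missingness depends on Y. In the second case the regression coefficient is not recovered by this procedure, and a sensitivity analysis would be the natural next step.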


Figures

Figure 1.
Canonical complete-data directed acyclic graph (c-DAG) for a general point-exposure study. For illustration, we provide under each node heading the variables involved in an example study of maternal mental illness and child behavior that used data from waves 1–3 of the Longitudinal Study of Australian Children (2004–2008). SDQ, Strengths and Difficulties Questionnaire.
Figure 2.
Canonical missingness directed acyclic graphs (m-DAGs) for a general point-exposure study. These 10 m-DAGs were identified as providing the most general forms of all essentially distinct extensions of the m-DAG shown in panel A (referred to as “m-DAG A”) in terms of recoverability. To illustrate how each m-DAG extends m-DAG A, the additional arrows are indicated with a heavier line. In the text and tables, we refer to each m-DAG according to its figure locant (m-DAG A, m-DAG B, etc.).
Figure 3.
Assessment of the existence of an arrow from each incomplete variable to each missingness indicator in the example from the Longitudinal Study of Australian Children (2004–2008), drawing from evidence in the literature (, –45). SDQ, Strengths and Difficulties Questionnaire.

References

    1. van Buuren S, Groothuis-Oudshoorn K. Mice: multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3):1–67.
    2. Schafer JL. Analysis of Incomplete Multivariate Data. London, United Kingdom: Chapman & Hall Ltd.; 1997.
    3. Sterne JA, White IR, Carlin JB, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338:b2393.
    4. Seaman S, Galati J, Jackson D, et al. What is meant by “missing at random”? Stat Sci. 2013;28(2):257–268.
    5. Mealli F, Rubin DB. Clarifying missing at random and related definitions, and implications when coupled with exchangeability. Biometrika. 2015;102(4):995–1000.
