Canonical Causal Diagrams to Guide the Treatment of Missing Data in Epidemiologic Studies
- PMID: 30124749
- PMCID: PMC6269242
- DOI: 10.1093/aje/kwy173
Erratum in
- Correction to: "Canonical causal diagrams to guide the treatment of missing data in epidemiologic studies". Am J Epidemiol. 2025 Mar 4;194(3):877-880. doi: 10.1093/aje/kwae406. PMID: 39825497.
Abstract
With incomplete data, the "missing at random" (MAR) assumption is widely understood to enable unbiased estimation with appropriate methods. While the need to assess the plausibility of MAR and to perform sensitivity analyses considering "missing not at random" (MNAR) scenarios has been emphasized, the practical difficulty of these tasks is rarely acknowledged. With multivariable missingness, what MAR means is difficult to grasp, and in many MNAR scenarios unbiased estimation is possible using methods commonly associated with MAR. Directed acyclic graphs (DAGs) have been proposed as an alternative framework for specifying practically accessible assumptions beyond the MAR-MNAR dichotomy. However, there is currently no general algorithm for deciding how to handle the missing data given a specific DAG. Here we construct "canonical" DAGs capturing typical missingness mechanisms in epidemiologic studies with incomplete data on exposure, outcome, and confounding factors. For each DAG, we determine whether common target parameters are "recoverable," meaning that they can be expressed as functions of the available data distribution and thus estimated consistently, or whether sensitivity analyses are necessary. We investigate the performance of available-case and multiple-imputation procedures. Using data from waves 1-3 of the Longitudinal Study of Australian Children (2004-2008), we illustrate how our findings can guide the treatment of missing data in point-exposure studies.
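The point that some MNAR mechanisms still allow unbiased estimation with standard methods can be illustrated with a small simulation. The sketch below is not taken from the article: the linear outcome model, the logistic missingness models, the sample size, and every name in it (`expit`, `ols_slope`, `TRUE_SLOPE`, and so on) are illustrative assumptions. It compares an available-case (complete-case) regression of an outcome on an exposure under two hypothetical mechanisms: exposure missingness that depends on the exposure value itself (MNAR, yet unrelated to the outcome given the exposure), and missingness that depends on the outcome.

```python
import math
import random


def expit(z):
    """Inverse logit, used here as a hypothetical missingness model."""
    return 1.0 / (1.0 + math.exp(-z))


def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


rng = random.Random(0)
TRUE_SLOPE = 2.0  # assumed exposure-outcome effect in the simulated data

# Simulated full data: exposure x, outcome y = 1 + 2x + noise.
data = []
for _ in range(50_000):
    x = rng.gauss(0.0, 1.0)
    y = 1.0 + TRUE_SLOPE * x + rng.gauss(0.0, 1.0)
    data.append((x, y))

# Mechanism A: the exposure is more likely to be missing when it is large.
# Missingness depends on the incomplete variable itself (MNAR), but not on
# the outcome once the exposure is given.
cc_exposure = [(x, y) for x, y in data if rng.random() < expit(0.5 - x)]

# Mechanism B: records are more likely to be incomplete when the outcome
# is large, so selection into the complete cases depends on the outcome.
cc_outcome = [(x, y) for x, y in data if rng.random() < expit(0.5 - 2.0 * y)]

slope_full = ols_slope(data)
slope_cc_exposure = ols_slope(cc_exposure)
slope_cc_outcome = ols_slope(cc_outcome)

print(f"full data:             {slope_full:.3f}")
print(f"complete cases, mech A: {slope_cc_exposure:.3f}")
print(f"complete cases, mech B: {slope_cc_outcome:.3f}")
```

Under these assumptions, the full-data and mechanism A slopes should both land close to 2, while the mechanism B slope should fall noticeably below it. This is only a toy two-variable case; the canonical DAGs in the article additionally cover confounders and multivariable missingness.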
