Multicenter Study
Am J Epidemiol. 2018 Mar 1;187(3):568-575.
doi: 10.1093/aje/kwx348.

Principled Approaches to Missing Data in Epidemiologic Studies

Neil J Perkins et al. Am J Epidemiol.

Abstract

Principled methods for analyzing missing data have long existed; however, broad implementation of these methods remains challenging. In this and 2 companion papers (Am J Epidemiol. 2018;187(3):576-584 and Am J Epidemiol. 2018;187(3):585-591), we discuss issues pertaining to missing data in the epidemiologic literature. We provide details regarding missing-data mechanisms and nomenclature and encourage the conduct of principled analyses through a detailed comparison of multiple imputation and inverse probability weighting. Data from the Collaborative Perinatal Project, a multisite US study conducted from 1959 to 1974, are used to create a masked data-analytical challenge with missing data induced by known mechanisms. We illustrate the deleterious effects of analyzing missing data with naive methods and show how principled methods can sometimes mitigate such effects. For example, when data were missing at random, naive methods showed a spurious protective effect of smoking on the risk of spontaneous abortion (odds ratio (OR) = 0.43, 95% confidence interval (CI): 0.19, 0.93), while the principled methods, multiple imputation (OR = 1.30, 95% CI: 0.95, 1.77) and augmented inverse probability weighting (OR = 1.40, 95% CI: 1.00, 1.97), provided estimates closer to the "true" full-data effect (OR = 1.31, 95% CI: 1.05, 1.64). We call for greater acknowledgement of and attention to missing data and for the broad use of principled missing-data methods in epidemiologic research.
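The contrast between a naive and a principled analysis under missing at random can be sketched with a small simulation. The sketch below is illustrative only and is not the authors' analysis or data: the variables `x` (exposure), `y` (outcome), and `z` (a fully observed covariate), the probabilities in `p_obs`, and the assumed true odds ratio of 1.3 are all hypothetical choices. It also uses plain stratified inverse probability weighting, not the augmented version reported in the abstract. The exposure is made missing with a probability that depends only on the fully observed `y` and `z`, so the data are missing at random.

```python
import math
import random


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def odds_ratio(table):
    """Odds ratio from a 2x2 table keyed by (x, y)."""
    return (table[(1, 1)] * table[(0, 0)]) / (table[(1, 0)] * table[(0, 1)])


def simulate(n=200_000, seed=0):
    """Simulate (x, y, z, observed) rows; all parameters are hypothetical."""
    rng = random.Random(seed)
    # Probability that x is observed, keyed by (y, z). It depends only on
    # fully observed variables, so x is missing at random.
    p_obs = {(1, 0): 0.9, (0, 0): 0.2, (1, 1): 0.2, (0, 1): 0.9}
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)
        x = int(rng.random() < (0.6 if z else 0.2))
        # Assumed true exposure-outcome odds ratio of 1.3.
        y = int(rng.random() < expit(-2.0 + math.log(1.3) * x))
        observed = rng.random() < p_obs[(y, z)]
        rows.append((x, y, z, observed))
    return rows


def analyze(rows):
    """Return (full-data OR, complete-case OR, IPW OR)."""
    cells = [(x, y) for x in (0, 1) for y in (0, 1)]
    full = dict.fromkeys(cells, 0.0)
    complete_case = dict.fromkeys(cells, 0.0)
    n_all = dict.fromkeys(cells, 0)  # counts keyed by (y, z)
    n_obs = dict.fromkeys(cells, 0)  # observed counts keyed by (y, z)
    for x, y, z, observed in rows:
        full[(x, y)] += 1
        n_all[(y, z)] += 1
        if observed:
            complete_case[(x, y)] += 1
            n_obs[(y, z)] += 1
    # Weight each complete case by the inverse of the estimated probability
    # of being observed within its (y, z) stratum.
    ipw = dict.fromkeys(cells, 0.0)
    for x, y, z, observed in rows:
        if observed:
            ipw[(x, y)] += n_all[(y, z)] / n_obs[(y, z)]
    return odds_ratio(full), odds_ratio(complete_case), odds_ratio(ipw)


if __name__ == "__main__":
    or_full, or_cc, or_ipw = analyze(simulate())
    print(f"full data:     OR = {or_full:.2f}")
    print(f"complete case: OR = {or_cc:.2f}")
    print(f"IPW:           OR = {or_ipw:.2f}")
```

Under these assumed parameters, the complete-case odds ratio falls well below 1 even though the simulated effect is harmful, while the weighted estimate lands close to the full-data value. The weights work here because the missingness model is saturated in `y` and `z`; with continuous or many covariates, a fitted model such as a logistic regression would typically replace the stratum proportions.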

Publication types