Missing data analysis using multiple imputation: getting to the heart of the matter
- PMID: 20123676
- PMCID: PMC2818781
- DOI: 10.1161/CIRCOUTCOMES.109.875658
Abstract
Missing data are a pervasive problem in health investigations. We describe some background of missing data analysis and criticize ad hoc methods that are prone to serious problems. We then focus on multiple imputation, in which missing values are first filled in with several sets of plausible values to create multiple completed datasets, standard complete-data procedures are then applied to each completed dataset, and finally the multiple sets of results are combined to yield a single inference. We introduce the basic concepts and general methodology and provide some guidance for application. For illustration, we use a study assessing the effect of cardiovascular diseases on hospice discussion for late-stage lung cancer patients.
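The final combining step described above is commonly carried out with Rubin's rules (Rubin, 1987). The following is a minimal sketch of that step only, assuming the imputation and the complete-data analyses have already been done. The function name pool_rubin and the input numbers are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data results into one inference (Rubin's rules).

    estimates: point estimates, one per completed dataset
    variances: squared standard errors, one per completed dataset
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


# Hypothetical results from m = 3 completed datasets (illustrative only).
estimates = [1.0, 2.0, 3.0]
variances = [0.5, 0.5, 0.5]

pooled, total_var = pool_rubin(estimates, variances)
pooled_se = sqrt(total_var)
print(pooled, total_var, pooled_se)
```

The between-imputation term is what separates this from simply averaging the results: it inflates the variance to reflect uncertainty about the missing values, which a single filled-in dataset would ignore.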
Conflict of interest statement
None