Improving upon the efficiency of complete case analysis when covariates are MNAR

Jonathan W Bartlett¹, James R Carpenter², Kate Tilling³, Stijn Vansteelandt⁴

Affiliations

¹ Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK jonathan.bartlett@lshtm.ac.uk.
² Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK and MRC Clinical Trials Unit, Kingsway, London WC2B 6NH, UK.
³ School of Social and Community Medicine, University of Bristol, Canynge Hall, 39 Whatley Road, Bristol BS8 2PS, UK.
⁴ Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Krijgslaan, 281 S9, B-9000 Ghent, Belgium.

PMID: 24907708
PMCID: PMC4173105
DOI: 10.1093/biostatistics/kxu023

Biostatistics. 2014 Oct;15(4):719-30. doi: 10.1093/biostatistics/kxu023. Epub 2014 Jun 6.

Erratum in

Corrigendum: Improving upon the efficiency of complete case analysis when covariates are MNAR (10.1093/biostatistics/kxu023).
Bartlett JW, Carpenter JR, Tilling K, Vansteelandt S. Biostatistics. 2015 Jan;16(1):205. doi: 10.1093/biostatistics/kxu051. PMID: 25505287.

Abstract

Missing values in covariates of regression models are a pervasive problem in empirical research. Popular approaches for analyzing partially observed datasets include complete case analysis (CCA), multiple imputation (MI), and inverse probability weighting (IPW). In the case of missing covariate values, these methods (as typically implemented) are valid under different missingness assumptions. In particular, CCA is valid under missing not at random (MNAR) mechanisms in which missingness in a covariate depends on the value of that covariate, but is conditionally independent of outcome. In this paper, we argue that in some settings such an assumption is more plausible than the missing at random assumption underpinning most implementations of MI and IPW. When the former assumption holds, although CCA gives consistent estimates, it does not make use of all observed information. We therefore propose an augmented CCA approach which makes the same conditional independence assumption for missingness as CCA, but which improves efficiency through specification of an additional model for the probability of missingness, given the fully observed variables. The new method is evaluated using simulations and illustrated through application to data on reported alcohol consumption and blood pressure from the US National Health and Nutrition Examination Survey, in which data are likely MNAR independent of outcome.

Keywords: Complete case analysis; Missing covariates; Missing not at random; Multiple imputation.
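
The validity condition for CCA described in the abstract can be illustrated with a small simulation. The sketch below is not the augmented CCA estimator proposed in the paper; it only checks, under assumed settings, that a complete case linear regression recovers the slope when missingness in a covariate depends on that covariate alone, and that it does not when missingness depends on the outcome. The data-generating model, the logistic missingness mechanisms, the sample size, and all names (`expit`, `cca_slope`, `obs_a`, `obs_b`) are illustrative assumptions.

```python
import numpy as np


def expit(z):
    """Logistic function, mapping a real value to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def cca_slope(x, y, observed):
    """OLS slope of y on x using only the rows where x is observed."""
    slope, _intercept = np.polyfit(x[observed], y[observed], 1)
    return slope


rng = np.random.default_rng(0)
n = 50_000
true_slope = 2.0

# Assumed outcome model: y = 1 + 2x + standard normal noise.
x = rng.normal(size=n)
y = 1.0 + true_slope * x + rng.normal(size=n)

# Mechanism A: x is more likely to be missing when x is large (MNAR),
# but missingness is independent of y given x.
obs_a = rng.uniform(size=n) < expit(-x)

# Mechanism B: x is more likely to be missing when the outcome y is large.
obs_b = rng.uniform(size=n) < expit(-y)

slope_a = cca_slope(x, y, obs_a)
slope_b = cca_slope(x, y, obs_b)

print(f"true slope:                   {true_slope:.3f}")
print(f"CCA, missingness depends on x: {slope_a:.3f}")
print(f"CCA, missingness depends on y: {slope_b:.3f}")
```

Under mechanism A the complete case slope should sit close to the true value, whereas under mechanism B it should be visibly attenuated, because selecting on the outcome distorts the regression of y on x among the complete cases.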

