Regression analysis with missing covariate data using estimating equations
- PMID: 8962448
Abstract
In regression analysis, missing covariate data are among the most common problems. Practitioners frequently adopt the so-called complete-case analysis, i.e., they perform the analysis on only the complete records, after excluding those with missing covariates. A complete-case analysis is convenient with existing statistical packages, but it may be inefficient because the observed outcomes and covariates on records with missing covariates are not used. It can even give misleading statistical inference if the data are not missing completely at random. This paper introduces a joint estimating equation (JEE) for regression analysis in the presence of missing observations on one covariate; the JEE may be viewed as one method within the general framework for the missing covariate data problem proposed by Robins, Rotnitzky, and Zhao (1994, Journal of the American Statistical Association 89, 846-866). A generalization of the JEE to more than one such covariate is discussed. The JEE is generally applicable to estimating regression coefficients from a regression model, including linear and logistic regression. Provided that the covariate data are either missing completely at random or missing at random (in addition to mild regularity conditions), estimates of regression coefficients from the JEE are consistent and asymptotically normally distributed. Simulation results show that the asymptotic distribution approximates the distribution of the estimated coefficients well in finite samples. The simulation study also shows that the validity of JEE estimates depends on correct specification of the probability function that characterizes the missingness mechanism, suggesting a need for further research on how to make the estimation robust to misspecification of this nuisance model. Finally, the JEE is illustrated with an application to a case-control study of diet and thyroid cancer.
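The contrast between a complete-case analysis and a weighted estimating equation can be illustrated with a small simulation. The sketch below is not the JEE itself, whose exact form the abstract does not give; it is a plain inverse-probability-weighted estimating equation of the kind covered by the Robins, Rotnitzky, and Zhao framework, applied to linear regression. Everything in it is an assumption made for illustration: the data-generating model, the coefficient values, the missingness model (the covariate x is observed with a probability that depends on the fully observed outcome y), and the helper names `fit_logistic`, `weighted_ls`, and `simulate`.

```python
import numpy as np


def expit(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))


def fit_logistic(W, r, n_iter=25):
    """Newton-Raphson fit of a logistic regression of r on the columns of W."""
    gamma = np.zeros(W.shape[1])
    for _ in range(n_iter):
        p = expit(W @ gamma)
        score = W.T @ (r - p)
        info = (W * (p * (1.0 - p))[:, None]).T @ W
        gamma = gamma + np.linalg.solve(info, score)
    return gamma


def weighted_ls(X, y, w):
    """Solve the estimating equation sum_i w_i X_i (y_i - X_i' beta) = 0."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)


def simulate(n=50_000, seed=0):
    """Hypothetical data: y = 1 + 1*x - 0.5*z + noise, with x sometimes missing."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = rng.normal(size=n)
    beta = np.array([1.0, 1.0, -0.5])
    X = np.column_stack([np.ones(n), x, z])
    y = X @ beta + rng.normal(size=n)
    # Missing at random: the chance that x is observed depends only on y,
    # which is always observed. This choice is an assumption of the sketch.
    pi_true = expit(0.5 + 1.0 * y)
    r = (rng.uniform(size=n) < pi_true).astype(float)
    return X, y, z, r, beta


X, y, z, r, beta = simulate()
obs = r == 1  # records on which x is observed

# Complete-case analysis: unweighted least squares on the complete records.
beta_cc = weighted_ls(X[obs], y[obs], np.ones(obs.sum()))

# Estimate the observation probability from the always-observed (y, z),
# then weight each complete record by the inverse of that probability.
W = np.column_stack([np.ones_like(y), y, z])
pi_hat = expit(W @ fit_logistic(W, r))
beta_ipw = weighted_ls(X[obs], y[obs], 1.0 / pi_hat[obs])

print("true coefficients:", beta)
print("complete-case    :", np.round(beta_cc, 3))
print("inverse-weighted :", np.round(beta_ipw, 3))
```

In this setup the probability model for the missingness mechanism is correctly specified, which is the condition the abstract identifies as necessary for valid estimates. Replacing the columns of `W` with a misspecified set (for example, dropping `y`) is a simple way to see how the weighted estimate degrades when that nuisance model is wrong.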