On the performance of multiple imputation based on chained equations in tackling missing data of the African α3.7-globin deletion in a malaria association study
- PMID: 24942080
- PMCID: PMC4140543
- DOI: 10.1111/ahg.12065
Abstract
Multiple imputation based on chained equations (MICE) is an alternative method for imputing missing genotypes that can use genetic and nongenetic auxiliary data to inform the imputation process. Previously, MICE was successfully tested on strongly linked genetic data. We have now tested it on data for the HBA2 gene which, owing to the experimental design of a malaria association study in Tanzania, has a high percentage of missing data and is weakly linked with the remaining genetic markers in the data set. We constructed different imputation models and studied their performance under different missing data conditions. Overall, MICE failed to accurately predict the true genotypes. However, using the best imputation model for the data, we obtained unbiased estimates of the genetic effects and of the association signals of the HBA2 gene on malaria positivity. When the whole data set was analyzed with the same imputation model, the association signal increased from 0.80 before imputation to 2.70 after imputation. In contrast, postimputation estimates of the genetic effects remained the same as in the complete case analysis but showed increased precision. We argue that these postimputation estimates are reasonably unbiased as a result of a good study design based on matching key socio-environmental factors.
Keywords: Genotype imputation; HBA2 gene; malaria positivity; multiple imputation based on chained equations.
© 2014 The Authors. Annals of Human Genetics published by John Wiley & Sons Ltd and University College London (UCL).
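The abstract reports postimputation effect estimates and their precision but does not say how results from the individual imputed data sets were combined. As a minimal sketch, assuming the standard pooling step of multiple imputation (Rubin's rules), the Python function below combines one coefficient estimated separately on each imputed data set. The function name `pool_rubin` and the numbers in the usage example are hypothetical illustrations, not values from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets (Rubin's rules).

    estimates  -- the coefficient estimated on each imputed data set
    std_errors -- the matching standard error from each imputed data set
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations with matching lengths")

    pooled = mean(estimates)                      # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # mean within-imputation variance
    between = variance(estimates)                 # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between        # total variance of the pooled estimate
    return pooled, sqrt(total)


# Hypothetical log-odds estimates from three imputed data sets.
estimate, std_error = pool_rubin([0.5, 0.6, 0.7], [0.2, 0.2, 0.2])
print(f"pooled estimate = {estimate:.3f}, pooled SE = {std_error:.3f}")
```

The between-imputation term grows when the imputed data sets disagree, so the pooled standard error reflects uncertainty about the missing genotypes as well as ordinary sampling error.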