Ann Hum Genet. 2014 Jul;78(4):277-89.
doi: 10.1111/ahg.12065.

On the performance of multiple imputation based on chained equations in tackling missing data of the African α3.7-globin deletion in a malaria association study

Free PMC article

Nuno Sepúlveda et al. Ann Hum Genet. 2014 Jul.

Abstract

Multiple imputation based on chained equations (MICE) is an alternative method for imputing missing genotypes that can use genetic and nongenetic auxiliary data to inform the imputation process. Previously, MICE was successfully tested on strongly linked genetic data. We have now tested it on data for the HBA2 gene, which, owing to the experimental design of a malaria association study in Tanzania, has a high percentage of missing data and is weakly linked with the remaining genetic markers in the data set. We constructed different imputation models and studied their performance under different missing data conditions. Overall, MICE failed to accurately predict the true genotypes. However, using the best imputation model for the data, we obtained unbiased estimates of the genetic effects and of the association signals of the HBA2 gene on malaria positivity. When the whole data set was analyzed with the same imputation model, the association signal increased from 0.80 before imputation to 2.70 after imputation. In contrast, postimputation estimates of the genetic effects remained the same as those of the complete case analysis but showed increased precision. We argue that these postimputation estimates are reasonably unbiased as a result of a good study design based on matching key socio-environmental factors.
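The abstract reports postimputation estimates and their precision but does not say how results from the individual imputed data sets were combined. A common choice in multiple imputation is Rubin's rules, and the sketch below assumes that choice. The function name `pool_rubin` and the numbers in the usage lines are hypothetical and are not taken from the study.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (assumed, not from the paper).

    estimates: one point estimate per imputed data set
    variances: the matching squared standard errors
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)       # average of the m estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log-odds-ratio estimates from three imputed data sets.
pooled, total = pool_rubin([0.10, 0.20, 0.30], [0.04, 0.04, 0.04])
print(pooled, total ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term carries the extra uncertainty due to the missing genotypes, so the pooled standard error is never smaller than the square root of the average within-imputation variance.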

Keywords: Genotype imputation; HBA2 gene; malaria positivity; multiple imputation based on chained equations.

Figures

Figure 1
(A) Testing a missing completely at random (MCAR) hypothesis under the basic assumption of the missing at random (MAR) model for the data resulting from the cross-tabulation of the number of α3.7-globin deletions with malaria parasite positivity. Each dot represents the p-value for the corresponding likelihood ratio test. The horizontal dotted line marks the 5% significance level. In this analysis, we accepted the MCAR hypothesis for data from villages where the p-value was > 0.05. Rejection of MCAR led to acceptance of an MAR mechanism. (B) Association analysis between α3.7-globin deletions and different variables (phenotypes at the left, SNPs at the centre, and socioenvironmental factors at the right) using complete data. The association signal is expressed as −log10(p-value) for the corresponding association test: χ2 tests for categorical explanatory variables (SNPs, low Hb, anemia, parasite positivity, gender, transect, village, and ethnicity) and score tests for quantitative explanatory variables (Hb levels, parasite density, age, and altitude) using a three-category logistic regression framework. The horizontal dashed line marks −log10(0.001), corresponding to a 0.1% significance level.
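As an illustration of the association signal defined in panel B, the sketch below computes −log10(p-value) for a Pearson χ2 test on a 3 × 2 table (0, 1, or 2 deletions by parasite positivity). It is a minimal sketch, not the authors' code: the function names and the counts are hypothetical, and the closed form used for the p-value holds only for 2 degrees of freedom, where the χ2 survival function is exp(−x/2).

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("table has an empty row or column")
            stat += (observed - expected) ** 2 / expected
    return stat


def association_signal(table):
    """Return -log10(p) for the chi-square test; valid only when df == 2."""
    df = (len(table) - 1) * (len(table[0]) - 1)
    if df != 2:
        raise ValueError("closed form below applies only to 2 degrees of freedom")
    # For df = 2, p = exp(-x / 2), so -log10(p) = x / (2 * ln 10).
    return chi2_statistic(table) / (2 * math.log(10))


# Hypothetical counts: rows are 0, 1, 2 deletions; columns are parasite negative, positive.
counts = [[120, 60], [90, 30], [30, 6]]
threshold = -math.log10(0.001)  # the dashed line in panel B
print(association_signal(counts), threshold)
```

Working on the −log10 scale directly avoids numerical underflow when the test statistic is large.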
