A Likelihood-Based Approach for Missing Genotype Data
- PMID: 20068333
- PMCID: PMC7077088
- DOI: 10.1159/000273732
Abstract
Missing genotype data in a candidate gene association study can make it difficult to model the effects of multiple genetic variants simultaneously. In particular, when regression models are used to model phenotype as a function of SNP genotypes in several different genes, the most common approach is a complete case analysis, in which only individuals with no missing genotypes are included. However, this can lead to a substantial reduction in sample size and thus to potential bias and loss of efficiency. A number of other methods for handling missing data are applicable but have rarely been used in this context. The purpose of this paper is to describe how several standard methods for handling missing data can be applied or adapted to this problem, and to compare their performance using a simulation study. We demonstrate these techniques using an Alzheimer's disease association study. We show that the expectation-maximization algorithm and multiple imputation with a bootstrapped expectation-maximization sampling algorithm have the best properties of all the estimators studied.
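To give a rough sense of why a complete case analysis can shrink the sample so much, the sketch below computes the expected share of individuals with no missing genotype when several SNPs are modeled together. It is an illustration only and is not taken from the paper: it assumes each genotype is missing independently with the same probability, and the function names, missingness rate, and SNP count are hypothetical choices.

```python
import random


def expected_complete_fraction(missing_rate: float, n_snps: int) -> float:
    """Expected fraction of individuals with no missing genotype.

    Assumption (not from the paper): each SNP genotype is missing
    independently with probability `missing_rate`.
    """
    return (1.0 - missing_rate) ** n_snps


def simulate_complete_cases(n_individuals: int, n_snps: int,
                            missing_rate: float, seed: int = 0) -> int:
    """Count simulated individuals whose genotypes are all observed."""
    rng = random.Random(seed)
    complete = 0
    for _ in range(n_individuals):
        # An individual is a complete case only if no SNP is missing.
        if all(rng.random() >= missing_rate for _ in range(n_snps)):
            complete += 1
    return complete


if __name__ == "__main__":
    rate, snps, n = 0.05, 10, 20000  # hypothetical illustrative values
    print(f"expected fraction:  {expected_complete_fraction(rate, snps):.3f}")
    print(f"simulated fraction: {simulate_complete_cases(n, snps, rate) / n:.3f}")
```

Under these assumptions, 5% missingness at each of 10 SNPs leaves only about 60% of individuals as complete cases (0.95 to the 10th power), even though no single SNP is missing for more than 5% of the sample. Real missingness is rarely this simple, so the numbers should be read as a back-of-the-envelope guide only.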