BMC Proc. 2009 Dec 15;3 Suppl 7(Suppl 7):S5.
doi: 10.1186/1753-6561-3-s7-s5.

Assessment of genotype imputation methods

Joanna M Biernacka et al. BMC Proc.

Abstract

Several methods have been proposed to impute genotypes at untyped markers using observed genotypes and genetic data from a reference panel. We used the Genetic Analysis Workshop 16 rheumatoid arthritis case-control dataset to compare the performance of four of these imputation methods: IMPUTE, MACH, PLINK, and fastPHASE. We compared the methods' imputation error rates and the performance of association tests based on the imputed data in two scenarios: imputing completely untyped markers, and imputing missing genotypes to combine two datasets genotyped at different sets of markers. As expected, all methods performed better for single-nucleotide polymorphisms (SNPs) in high linkage disequilibrium with genotyped SNPs. MACH and IMPUTE, however, generated lower imputation error rates than fastPHASE and PLINK. Association tests based on allele "dosage" from MACH and tests based on the posterior probabilities from IMPUTE provided results closest to those based on complete data. Nevertheless, in both scenarios, none of the imputation-based tests provided the same level of evidence of association as the complete data at SNPs strongly associated with disease.
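
To make the quantities in the abstract concrete, the following is a minimal Python sketch of how an allele "dosage" and a per-genotype imputation error rate might be computed from posterior genotype probabilities. It is not taken from the paper or from any of the four programs: the function names, the example probabilities, and the error definition (fraction of best-guess genotypes that disagree with masked true genotypes) are assumptions made for illustration, since the abstract does not define its error measure.

```python
from typing import Sequence, Tuple

# Posterior probabilities of carrying 0, 1, or 2 copies of the counted allele.
Posterior = Tuple[float, float, float]


def allele_dosage(posterior: Posterior) -> float:
    """Expected number of copies of the counted allele (ranges from 0 to 2)."""
    _p0, p1, p2 = posterior
    return p1 + 2.0 * p2


def best_guess(posterior: Posterior) -> int:
    """Most probable genotype, coded as the allele count 0, 1, or 2."""
    return max(range(3), key=lambda g: posterior[g])


def genotype_error_rate(posteriors: Sequence[Posterior], truth: Sequence[int]) -> float:
    """Fraction of best-guess genotypes that differ from the true genotypes.

    Assumed definition for illustration; the paper may use a different one.
    """
    if len(posteriors) != len(truth) or len(truth) == 0:
        raise ValueError("posteriors and truth must be non-empty and equally long")
    wrong = sum(best_guess(p) != t for p, t in zip(posteriors, truth))
    return wrong / len(truth)


# Hypothetical posteriors for one SNP in four individuals (made-up numbers).
posteriors = [
    (0.90, 0.09, 0.01),
    (0.10, 0.80, 0.10),
    (0.05, 0.35, 0.60),
    (0.40, 0.55, 0.05),
]
truth = [0, 1, 2, 0]  # genotypes that were masked before imputation

dosages = [allele_dosage(p) for p in posteriors]
error_rate = genotype_error_rate(posteriors, truth)
```

In this toy example the fourth individual's best-guess genotype (1) differs from the true genotype (0), so one of four calls is wrong, while the dosage of 0.65 retains the uncertainty. This illustrates why tests that use dosages or posterior probabilities can behave differently from tests on hard genotype calls.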

Figures

Figure 1. Imputation error rates decline with increasing LD (scenario 2).

Figure 2. Comparison of association test results (-log10(p-value)) based on complete data with tests based on imputed data under scenario 1 (imputation of untyped markers).

Figure 3. Association test results (-log10(p-value)) based on different imputation methods in the PTPN22 region under scenario 2 (imputation to combine two datasets).
