Genet Epidemiol. 2012 Feb;36(2):107-17.
doi: 10.1002/gepi.21603.

Genotype imputation of Metabochip SNPs using a study-specific reference panel of ~4,000 haplotypes in African Americans from the Women's Health Initiative


Eric Yi Liu et al. Genet Epidemiol. 2012 Feb.

Abstract

Genetic imputation has become standard practice in modern genetic studies. However, several important issues have not been adequately addressed, including the utility of a study-specific reference panel, performance in admixed populations, and quality for less common (minor allele frequency [MAF] 0.005-0.05) and rare (MAF < 0.005) variants. These issues only recently became addressable with genome-wide association study (GWAS) follow-up studies using dense genotyping or sequencing in large samples of non-European individuals. In this work, we constructed a study-specific reference panel of 3,924 haplotypes using African Americans in the Women's Health Initiative (WHI) genotyped on both the Metabochip and the Affymetrix 6.0 GWAS platform. We used this reference panel to impute into 6,459 WHI SNP Health Association Resource (SHARe) study subjects with only GWAS genotypes. Our analysis confirmed the imputation quality metric Rsq (estimated r², specific to each SNP) as an effective post-imputation filter. We recommend different Rsq thresholds for different MAF categories such that the average (across SNPs) Rsq is above the desired dosage r² (squared Pearson correlation between imputed and experimental genotypes). With a desired dosage r² of 80%, 99.9% (97.5%, 83.6%, 52.0%, 20.5%) of SNPs with MAF > 0.05 (0.03-0.05, 0.01-0.03, 0.005-0.01, and 0.001-0.005) passed the post-imputation filter. The average dosage r² for these SNPs is 94.7%, 92.1%, 89.0%, 83.1%, and 79.7%, respectively. These results suggest that, for African Americans, imputation of Metabochip SNPs from GWAS data, including low-frequency SNPs with MAF 0.005-0.05, is feasible and worthwhile for increasing power in downstream association analysis, provided a sizable reference panel is available.
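
The abstract defines dosage r² as the squared Pearson correlation between imputed and experimentally determined genotypes at a SNP. A minimal sketch of that calculation follows, assuming imputed dosages on a 0 to 2 scale and experimental genotypes coded as allele counts (0, 1, 2). The function name `dosage_r2` and the input layout are illustrative assumptions, not the authors' code.

```python
def dosage_r2(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and genotype counts.

    Hypothetical helper for illustration; inputs are per-individual values
    for a single SNP, in the same individual order.
    """
    n = len(imputed_dosages)
    if n != len(true_genotypes) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_genotypes) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in true_genotypes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, true_genotypes))

    # The correlation is undefined when either vector has no variation
    # (for example, a SNP that is monomorphic in the evaluated sample).
    if sxx == 0 or syy == 0:
        return float("nan")
    return (sxy * sxy) / (sxx * syy)
```

For example, `dosage_r2([0.1, 0.9, 1.8, 0.2], [0, 1, 2, 0])` compares four individuals' imputed dosages with their experimental genotypes. Averaging this quantity across SNPs within a MAF category would give a per-category summary of the kind the abstract reports.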

Figures

Fig. 1
Reference construction and imputation pipeline using a study-specific reference panel. This schematic cartoon shows how we constructed our study-specific reference panel using five individuals genotyped on both the Affymetrix 6.0 and the Metabochip platform and how we performed imputation into the remaining five individuals with Affymetrix 6.0 data only.
Fig. 2
Imputation accuracy by chromosome for 2% randomly masked GWAS SNPs. Imputation accuracy (as measured by average dosage r²) for 2% of GWAS SNPs masked at random is plotted by chromosome.
Fig. 3
Rsq by dosage r² for 2% randomly masked GWAS SNPs. Estimated imputation accuracy (minimac output Rsq) is plotted against the true dosage r² for the 2% of GWAS SNPs masked at random.
Fig. 4
Accuracy and calibration of imputation. Percentages of SNPs passing post-imputation QC (left y axis) and average dosage r² (right y axis) are plotted against the Rsq threshold used for post-imputation QC for SNPs in different MAF categories.
Fig. 5
Rsq by dosage r² heatmap for Metabochip SNPs (estimated by masking 100 reference individuals). Estimated imputation accuracy (minimac output Rsq) is plotted against the true dosage r² for Metabochip SNPs, obtained by masking 100 reference individuals. The color scheme is defined by the number of underlying SNPs, specifically log10(frequency).
