Comprehensive evaluation of imputation performance in African Americans

Pritam Chanda¹, Naoya Yuhki, Man Li, Joel S Bader, Alex Hartz, Eric Boerwinkle, W H Linda Kao, Dan E Arking

PMID: 22648186
PMCID: PMC3477509
DOI: 10.1038/jhg.2012.43

Pritam Chanda et al. J Hum Genet. 2012 Jul;57(7):411-21. doi: 10.1038/jhg.2012.43. Epub 2012 May 31.

Affiliation

¹ Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA.

Abstract

Imputation of genome-wide single-nucleotide polymorphism (SNP) arrays to a larger known reference panel of SNPs has become a standard and essential part of genome-wide association studies. However, little is known about the behavior of imputation in African Americans with respect to the different imputation algorithms, the reference population(s) and the reference SNP panels used. Genome-wide SNP data (Affymetrix 6.0) from 3207 African American samples in the Atherosclerosis Risk in Communities Study (ARIC) were used to systematically evaluate imputation quality and yield. Imputation was performed with the imputation algorithms MACH, IMPUTE and BEAGLE using several combinations of three HapMap III reference panels (ASW, YRI and CEU) and 1000 Genomes Project panels (pilot 1 YRI June 2010 release, EUR and AFR August 2010 and June 2011 releases) with SNP data on chromosomes 18, 20 and 22. About 10% of the directly genotyped SNPs from each chromosome were masked, and SNPs common between the reference panels were used for evaluating the imputation quality using two statistical metrics: concordance accuracy and Cohen's kappa (κ) coefficient. The dependencies of these metrics on the minor allele frequencies (MAF) and specific genotype categories (minor allele homozygotes, heterozygotes and major allele homozygotes) were thoroughly investigated to determine the best panel and method for imputation in African Americans. In addition, the power to detect imputed SNPs associated with simulated phenotypes was studied using the mean genotype of each masked SNP in the imputed data. Our results indicate that the genotype concordances after stratification into each genotype category and Cohen's κ coefficient are considerably better equipped to differentiate imputation performance than the traditionally used total concordance statistic, and both statistics improved with increasing MAF irrespective of the imputation method.
We also find that MACH and IMPUTE performed equally well and consistently better than BEAGLE irrespective of the reference panel used. Of the various combinations of reference panels, for both HapMap III and 1000 Genomes Project reference panels, the multi-ethnic panels had better imputation accuracy than those containing samples from only a single ethnic group. The most recent 1000 Genomes Project release (June 2011) yielded a substantially higher number of imputed SNPs than HapMap III and performed as well as or better than the best combined HapMap III reference panels and previous releases of the 1000 Genomes Project.
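
The two metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0/1/2 genotype coding (count of minor alleles) and the toy data are assumptions made here for illustration. The toy masked SNP has a rare minor allele, and the imputed calls are all major allele homozygotes.

```python
from collections import Counter

# Assumed coding (not from the paper): 0 = major allele homozygote,
# 1 = heterozygote, 2 = minor allele homozygote.
GENOTYPES = (0, 1, 2)

def genotype_concordance(true_gt, imputed_gt):
    """Return (total concordance, concordance per true genotype category)."""
    per_category = {}
    for g in GENOTYPES:
        idx = [i for i, t in enumerate(true_gt) if t == g]
        if idx:  # skip categories absent from the true genotypes
            per_category[g] = sum(imputed_gt[i] == g for i in idx) / len(idx)
    total = sum(t == p for t, p in zip(true_gt, imputed_gt)) / len(true_gt)
    return total, per_category

def cohen_kappa(true_gt, imputed_gt):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(true_gt)
    observed = sum(t == p for t, p in zip(true_gt, imputed_gt)) / n
    true_counts, imputed_counts = Counter(true_gt), Counter(imputed_gt)
    expected = sum(true_counts[g] * imputed_counts[g] for g in GENOTYPES) / n**2
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)

# Toy masked SNP: 18 major homozygotes, 1 heterozygote, 1 minor homozygote,
# with every sample imputed as a major allele homozygote.
true_gt = [0] * 18 + [1, 2]
imputed_gt = [0] * 20

total, per_category = genotype_concordance(true_gt, imputed_gt)
kappa = cohen_kappa(true_gt, imputed_gt)
```

In this toy case the total concordance is 0.9, yet concordance is 0 for both heterozygotes and minor allele homozygotes, and κ is 0. This is the kind of situation in which stratified concordance and κ separate imputation performance where total concordance does not.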

Figures

**Figure 1**
Distribution of concordance accuracy (CA) of minor allele homozygotes and heterozygotes for (a, b) MACH, (c, d) IMPUTE and (e, f) BEAGLE. (M=MACH, I=IMPUTE and B=BEAGLE).

**Figure 2**
Distribution of kappa for (a) MACH, (c) IMPUTE and (e) BEAGLE. Power is shown in (b) for MACH, (d) for IMPUTE and (f) for BEAGLE. (M=MACH, I=IMPUTE and B=BEAGLE).

**Figure 3**
Kappa vs yield for the three algorithms with ASW + CEU + YRI III for minor allele frequency (MAF) bins (a) ⩽ 0.05, (b) 0.05–0.1, (c) 0.1–0.3 and (d) 0.3–0.5. (M=MACH, I=IMPUTE and B=BEAGLE).

**Figure 4**
For panel ASW + CEU + YRI III, comparison of (a, b) mean concordance accuracy (CA) for minor allele homozygotes and heterozygotes and (c) mean kappa for each method at different minor allele frequency (MAF) bins. (M=MACH, I=IMPUTE and B=BEAGLE).

**Figure 5**
With MACH, (a–c) mean concordance accuracy (CA) for each genotype and (d) mean kappa using masked single-nucleotide polymorphisms (SNPs) exceeding a given $\hat{r}^{2}_{\text{cutoff}}$ for panel ASW + CEU + YRI III. Four minor allele frequency (MAF) bins are shown: ⩽ 0.05 (red), 0.05–0.1 (blue), 0.1–0.3 (green) and 0.3–0.5 (magenta).
