2015 Sep 14;6:8111.
doi: 10.1038/ncomms9111.

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Jie Huang et al. Nat Commun.

Abstract

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.
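Imputation accuracy of the kind discussed here is commonly summarised as the squared Pearson correlation (r²) between imputed allele dosages and the true genotypes at a variant. The following is a minimal sketch under that assumption; the function name `imputation_r2` and the toy inputs are hypothetical and are not taken from the paper.

```python
def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (coded 0/1/2)
    and imputed dosages (real values in [0, 2]) at a single variant.

    Assumption: accuracy is measured per variant across individuals.
    Returns NaN when either vector has zero variance (for example, a
    variant that is monomorphic in the test sample).
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")

    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n

    # Sums of centred products; the 1/n factors cancel in the ratio below.
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)

    if var_t == 0 or var_d == 0:
        return float("nan")
    return (cov * cov) / (var_t * var_d)


if __name__ == "__main__":
    truth = [0, 1, 2, 0, 1]            # toy genotypes, not real data
    dosage = [0.1, 0.9, 1.8, 0.0, 1.2]  # toy imputed dosages
    print(round(imputation_r2(truth, dosage), 3))
```

Rare variants have little genotype variance in a small test sample, so r² estimates at low minor allele frequency are noisy; in practice they are usually averaged over many variants within a frequency bin.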

Figures

Figure 1. Imputation performance for different imputation strategies and reference panels.
(a) Imputation accuracy in the UK10K pseudo-GWAS test panel using reference panels from 1000GP (black) and UK10K (blue). The ‘original’ UK10K reference panel (dotted blue line) was produced by standard genotype refinement of low-coverage sequencing data, whereas the ‘rephased’ reference panel (solid blue line) was produced by running SHAPEIT v2 on the genotypes called by BEAGLE to improve haplotype accuracy. (b) Number of imputed variants in the UK10K pseudo-GWAS panel as a function of predicted minor allele frequency in the study cohort (x-axis), expected imputation r² (density of shading), and reference panel: 1000GP (black), UK10K (blue), or combined UK10K and 1000GP (red). Confidently imputed variants are shown in the bottom segment of each bar for easy comparison. Note that expected r² tends to be larger than true r². (c) As in b, but using the INCIPE cohort (representative of the general Italian population) as a pseudo-GWAS panel. (d) Imputation accuracy in the INCIPE pseudo-GWAS panel using the UK10K reference panel and different imputation approximations. Results are provided for a run that used all reference haplotypes with no approximation (blue solid line), a run that used an established Hamming distance approximation (orange solid line), and a run that used a new tract sharing approximation (orange dashed line).
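Panel d refers to a Hamming distance approximation without describing it. One common reading, assumed here, is that each target haplotype is imputed from only the k reference haplotypes that differ from it at the fewest typed sites, rather than from the full panel. The sketch below illustrates that selection step only; the names `hamming_distance` and `select_closest_haplotypes` and the toy data are hypothetical, and this is not the authors' implementation.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length allele vectors differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same typed sites")
    return sum(x != y for x, y in zip(a, b))


def select_closest_haplotypes(target, reference, k):
    """Return indices of the k reference haplotypes nearest to `target`.

    Assumption: haplotypes are lists of 0/1 alleles at the typed
    (genotyped) sites only. Ties are broken by reference index,
    because Python's sort is stable.
    """
    ranked = sorted(range(len(reference)),
                    key=lambda i: hamming_distance(target, reference[i]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy data, not real haplotypes.
    target = [0, 1, 1, 0]
    reference = [
        [0, 1, 1, 0],  # distance 0
        [1, 0, 0, 1],  # distance 4
        [0, 1, 0, 0],  # distance 1
        [0, 0, 0, 0],  # distance 2
    ]
    print(select_closest_haplotypes(target, reference, k=2))  # [0, 2]
```

Restricting the copying states this way keeps run time roughly independent of panel size. The cost, under this reading, is that a haplotype sharing a long tract around a rare variant can still be excluded if it differs elsewhere in the window, which is the kind of case a tract sharing criterion would be expected to handle differently.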
