Genome Med. 2019 Oct 22;11(1):64.
doi: 10.1186/s13073-019-0677-z.

NARD: whole-genome reference panel of 1779 Northeast Asians improves imputation accuracy of rare and low-frequency variants

Seong-Keun Yoo et al. Genome Med. 2019.

Abstract

Here, we present the Northeast Asian Reference Database (NARD), which includes whole-genome sequencing data of 1779 individuals from Korea, Mongolia, Japan, China, and Hong Kong. NARD captures the genetic diversity of Korean (n = 850) and Mongolian (n = 384) ancestries that were not present in the 1000 Genomes Project Phase 3 (1KGP3). We combined and re-phased the genotypes from NARD and 1KGP3 to construct a union set of haplotypes. This approach established a robust imputation reference panel for Northeast Asians, which yields the greatest imputation accuracy for rare and low-frequency variants compared with the existing panels. The NARD imputation panel is available at https://nard.macrogen.com/.

Keywords: East Asians; Genotype imputation; Northeast Asians; Reference panel; Whole-genome sequencing.

Conflict of interest statement

The authors affiliated with Precision Medicine Institute are full-time employees at Macrogen: S.-K.Y., C.-U.K., S.K., J.-Y.S., N.K., J.S.Y., C.K., and J.-S.S. The remaining authors declare that they have no competing interests.

Figures

Fig. 1
Ancestry composition of 1779 individuals in the NARD. (a) PCA of global populations from the NARD and 1KGP3. AFR, AMR, EAS, EUR, and SAS denote Africans, Americans, East Asians, Europeans, and South Asians, respectively. (b) PCA of Northeast and Southeast Asians from the NARD and 1KGP3. Japanese in Tokyo from the 1KGP3 were combined into JPN. CHN from the NARD were categorized into CHB and CHS. (c) Population substructure of Northeast and Southeast Asians with five ancestral components inferred by the ADMIXTURE algorithm.
Fig. 2
Imputation performance evaluation. (a) Imputation accuracy assessment using the five different reference panels. The pseudo-GWAS panel of 97 KOR was used for the imputation. The x-axis represents the MAF of 850 KOR individuals from the NARD. The y-axis represents the aggregated R² values of SNPs, which were calculated from the true genotypes and the imputed dosages. Only SNPs that were imputed across all panels were used for the aggregation of R² values. (b) Number of imputed SNPs as a function of the estimated imputation accuracy and the type of imputation panel. This result was generated based on the R² values that were estimated by Minimac3.
Fig. 3
Variant interpretation using the NARD. (a) MAF differences of SNPs shared between the NARD and gnomAD. The y-axis denotes the MAF of SNPs in worldwide populations (ALL) or EAS from the gnomAD. Color represents the MAF of SNPs in 1779 Northeast Asians from the NARD. (b) Number of uncommon (MAF < 5%) protein-altering variants (missense, nonsense, frameshift, and splicing variants) after filtration using the gnomAD with/without NARD. The variant catalogue from the gnomAD (exome) was applied. ***P < 0.0001 by two-tailed Mann-Whitney U test (compared with gnomAD-EAS + NARD).
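
The Fig. 2 caption describes accuracy as aggregated R2 values computed from true genotypes and imputed dosages, plotted against MAF. The sketch below shows one common reading of that metric: pool every genotype/dosage pair from the SNPs in a MAF bin and take the squared Pearson correlation. The function name `aggregated_r2`, the array layout (SNPs in rows, samples in columns), and the bin edges are assumptions made for illustration; the record above does not specify the authors' exact procedure.

```python
import numpy as np

def aggregated_r2(true_genotypes, imputed_dosages, maf, bin_edges):
    """Squared Pearson correlation between true genotypes and imputed
    dosages, pooled over all SNPs whose MAF falls in each bin.

    true_genotypes, imputed_dosages: arrays of shape (n_snps, n_samples),
        with genotypes coded 0/1/2 and dosages in [0, 2] (assumed layout).
    maf: array of shape (n_snps,) with the minor allele frequency per SNP.
    bin_edges: increasing MAF boundaries; bins are [lo, hi).
    Returns one R2 value per bin (nan when a bin cannot be evaluated).
    """
    true_genotypes = np.asarray(true_genotypes, dtype=float)
    imputed_dosages = np.asarray(imputed_dosages, dtype=float)
    maf = np.asarray(maf, dtype=float)

    results = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (maf >= lo) & (maf < hi)
        # Flatten so every (SNP, sample) pair in the bin contributes once.
        x = true_genotypes[in_bin].ravel()
        y = imputed_dosages[in_bin].ravel()
        # Correlation is undefined for empty or constant inputs.
        if x.size < 2 or x.std() == 0 or y.std() == 0:
            results.append(float("nan"))
            continue
        r = np.corrcoef(x, y)[0, 1]
        results.append(float(r * r))
    return results
```

Pooling across SNPs, rather than averaging per-SNP correlations, keeps the estimate stable in the rare-variant bins, where individual SNPs carry very few minor alleles.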

