Sci Adv. 2020 May 27;6(22):eaaz7835.
doi: 10.1126/sciadv.aaz7835. eCollection 2020 May.

Korean Genome Project: 1094 Korean personal genomes with clinical information

Sungwon Jeon et al. Sci Adv. 2020.

Abstract

We present the initial phase of the Korean Genome Project (Korea1K), including 1094 whole genomes (sequenced at an average depth of 31×), along with data of 79 quantitative clinical traits. We identified 39 million single-nucleotide variants and indels of which half were singleton or doubleton and detected Korean-specific patterns based on several types of genomic variations. A genome-wide association study illustrated the power of whole-genome sequences for analyzing clinical traits, identifying nine more significant candidate alleles than previously reported from the same linkage disequilibrium blocks. Also, Korea1K, as a reference, showed better imputation accuracy for Koreans than the 1KGP panel. As proof of utility, germline variants in cancer samples could be filtered out more effectively when the Korea1K variome was used as a panel of normals compared to non-Korean variome sets. Overall, this study shows that Korea1K can be a useful genotypic and phenotypic resource for clinical and ethnogenetic studies.

Figures

Fig. 1. Variant statistics and discovery rate of novel variants.
(A) Number of variants in the Korea1K dataset across all autosomal regions, categorized by allele frequency (AF). Singleton, allele count = 1; doubleton, allele count = 2; rare, allele count > 2 and AF ≤ 0.01; common, 0.01 < AF ≤ 0.05; and very common, AF > 0.05. (B) The number of novel variants as a function of the number of unrelated Korean genome samples.
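The frequency categories in panel (A) can be written as a small decision rule. The following is a minimal sketch, not the authors' pipeline: the function name `classify_variant` is hypothetical, and the default of 2188 alleles assumes every one of the 1094 diploid samples has a called genotype at an autosomal site.

```python
def classify_variant(allele_count: int, allele_number: int = 2 * 1094) -> str:
    """Assign a variant to one of the Fig. 1A frequency categories.

    allele_count  -- number of alternative alleles observed (AC)
    allele_number -- total number of called alleles (AN); the default is an
                     assumption: 1094 diploid samples with no missing calls
    """
    if allele_number <= 0 or not 0 < allele_count <= allele_number:
        raise ValueError("need 0 < allele_count <= allele_number")

    # Singletons and doubletons are defined by count, not by frequency.
    if allele_count == 1:
        return "singleton"
    if allele_count == 2:
        return "doubleton"

    allele_frequency = allele_count / allele_number
    if allele_frequency <= 0.01:
        return "rare"
    if allele_frequency <= 0.05:
        return "common"
    return "very common"


if __name__ == "__main__":
    for ac in (1, 2, 3, 22, 110):
        print(ac, classify_variant(ac))
```

Checking the count-based categories first matters: with a small sample, an allele count of 1 or 2 could otherwise exceed the 0.01 frequency cutoff and be mislabeled.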
Fig. 2. Comparison with other populations.
Results of PCA of Korea1K and the 1KGP set of (A) worldwide populations and (B) East Asian samples. (C) The number of TE insertions with significantly different allele frequencies between the Korea1K set and each population. (D) The proportion of differential TE insertions. Colors indicate TE subtypes. Population abbreviations follow the 1KGP population codes (ACB, African Caribbean; ASW, African Ancestry in Southwest USA; BEB, Bengali; CDX, Dai Chinese; CEU, Utah residents with Northern and Western European ancestry; CHB, Han Chinese; CHS, Southern Han Chinese; CLM, Colombian; ESN, Esan; FIN, Finnish; GBR, British; GIH, Gujarati; GWD, Gambian Mandinka; IBS, Iberian; ITU, Telugu; JPT, Japanese; KHV, Kinh Vietnamese; LWK, Luhya; MSL, Mende; MXL, Mexican Ancestry; PEL, Peruvian; PJL, Punjabi; PUR, Puerto Rican; STU, Tamil; TSI, Toscani; and YRI, Yoruba).
Fig. 3. Manhattan plot of the reported loci via a GWAS.
Each color indicates a different clinical trait. The most significant reported markers in the loci are denoted with triangles. The dashed line indicates the threshold for genome-wide significance (7.5 × 10⁻⁹). The dotted line indicates the threshold for study-wide significance (9.5 × 10⁻¹¹).
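The two thresholds in this caption are numerically consistent with dividing the genome-wide threshold by the 79 clinical traits mentioned in the abstract (a Bonferroni-style correction). That relationship is an inference from the numbers in this record, not a stated description of the authors' method. The sketch below checks the arithmetic; the function names are hypothetical.

```python
GENOME_WIDE_THRESHOLD = 7.5e-9  # from the Fig. 3 caption
N_TRAITS = 79                   # quantitative clinical traits, from the abstract


def study_wide_threshold(genome_wide: float, n_traits: int) -> float:
    """Divide the genome-wide threshold by the number of traits tested.

    Assumption: a Bonferroni-style correction across traits.
    """
    if n_traits < 1:
        raise ValueError("n_traits must be at least 1")
    return genome_wide / n_traits


def significance_level(p_value: float) -> str:
    """Label a P value against the two thresholds drawn in the plot."""
    if p_value < study_wide_threshold(GENOME_WIDE_THRESHOLD, N_TRAITS):
        return "study-wide"
    if p_value < GENOME_WIDE_THRESHOLD:
        return "genome-wide"
    return "not significant"


if __name__ == "__main__":
    threshold = study_wide_threshold(GENOME_WIDE_THRESHOLD, N_TRAITS)
    print(f"{threshold:.1e}")  # rounds to the caption's study-wide value
    print(significance_level(1e-9))
```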
Fig. 4. Imputation performance evaluation.
The x axis indicates the alternative (Alt) allele frequency in the Korea1K set. The y axis represents the aggregated R² values of SNVs. Only SNVs present in the imputed results of all panels were used.
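Imputation accuracy is commonly summarized as the squared Pearson correlation (R²) between true genotypes and imputed dosages. The sketch below assumes that definition; this record does not say how the authors computed or aggregated R², and the function name `squared_correlation` is hypothetical. One way to aggregate, also an assumption, is to pool the genotype and dosage pairs of all SNVs in an allele frequency bin and pass the pooled lists to the function.

```python
import math
from typing import Sequence


def squared_correlation(truth: Sequence[float], imputed: Sequence[float]) -> float:
    """Squared Pearson correlation between true genotypes and imputed dosages.

    truth   -- true alternative-allele counts (0, 1, or 2) per sample
    imputed -- imputed dosages (0.0 to 2.0) for the same samples
    Returns NaN when either input has no variance, since R² is undefined then.
    """
    n = len(truth)
    if n != len(imputed) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")

    mean_truth = sum(truth) / n
    mean_imputed = sum(imputed) / n

    covariance = sum((t - mean_truth) * (d - mean_imputed)
                     for t, d in zip(truth, imputed))
    var_truth = sum((t - mean_truth) ** 2 for t in truth)
    var_imputed = sum((d - mean_imputed) ** 2 for d in imputed)

    if var_truth == 0 or var_imputed == 0:
        return math.nan
    return covariance * covariance / (var_truth * var_imputed)


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(squared_correlation([0, 1, 2, 0, 1], [0.1, 0.9, 1.8, 0.0, 1.2]))
```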
Fig. 5. Performance of the variant classification using different panels of normals.
(A) Accuracy (ACC) of classification. (B) Matthews correlation coefficient (MCC) values. (C) Germline recovery rate. The x axis indicates the used reference panel and allele frequency cutoff concatenated by the underscore symbol. EAS, SAS, AMR, EUR, and AFR indicate East Asian, South Asian, American, European, and African populations in 1KGP, respectively.
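The three metrics in this figure can all be derived from a confusion matrix. The sketch below uses the standard definitions of accuracy and the Matthews correlation coefficient. Treating "germline" as the positive class, and reading the germline recovery rate as the fraction of true germline variants that are classified as germline, are assumptions made for illustration, as are the function names and example counts.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all variants classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to 1.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def germline_recovery_rate(tp: int, fn: int) -> float:
    """Share of true germline variants that were labeled germline.

    Assumption: germline is the positive class, so this equals recall.
    """
    if tp + fn == 0:
        raise ValueError("no true germline variants")
    return tp / (tp + fn)


if __name__ == "__main__":
    # Illustrative counts only, not results from the study.
    tp, tn, fp, fn = 90, 80, 20, 10
    print(accuracy(tp, tn, fp, fn))
    print(matthews_corrcoef(tp, tn, fp, fn))
    print(germline_recovery_rate(tp, fn))
```

Unlike accuracy, MCC stays informative when germline and somatic variants are heavily imbalanced, which may be why the figure reports both.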
