This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2025 Jan 7:2024.05.17.24307550.

doi: 10.1101/2024.05.17.24307550.

Genome-wide association studies in a large Korean cohort identify novel quantitative trait loci for 36 traits and illuminate their genetic architectures

Yon Ho Jee¹, Ying Wang²,³, Keum Ji Jung⁴, Ji-Young Lee⁴, Heejin Kimm⁴, Rui Duan⁵, Alkes L Price¹,⁵,⁶, Alicia R Martin²,³,⁷, Peter Kraft¹,⁸

Affiliations

¹ Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
² Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA.
³ Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
⁴ Institute for Health Promotion, Department of Epidemiology and Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, Korea.
⁵ Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
⁶ Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
⁷ Department of Medicine, Harvard Medical School, Boston, MA, USA.
⁸ Transdivisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, MD, USA.

PMID: 38798434
PMCID: PMC11118625
DOI: 10.1101/2024.05.17.24307550

Yon Ho Jee et al. medRxiv. 2025.

Update in

Genome-wide association studies in a large Korean cohort identify quantitative trait loci for 36 traits and illuminate their genetic architectures.
Jee YH, Wang Y, Jung KJ, Lee JY, Kimm H, Duan R, Price AL, Martin AR, Kraft P. Nat Commun. 2025 May 28;16(1):4935. doi: 10.1038/s41467-025-59950-5. PMID: 40436827. Free PMC article.

Abstract

Genome-wide association studies (GWAS) have been predominantly conducted in populations of European ancestry, limiting opportunities for biological discovery in diverse populations. We report GWAS findings from 153,950 individuals across 36 quantitative traits in the Korean Cancer Prevention Study-II (KCPS2) Biobank. We discovered 301 novel genetic loci in KCPS2, including an association between thyroid-stimulating hormone and CD36. Meta-analysis with the Korean Genome and Epidemiology Study, Biobank Japan, Taiwan Biobank, and UK Biobank identified 4,588 loci that were not significant in any contributing GWAS. We describe differences in genetic architectures across these East Asian and European samples. We also highlight East Asian-specific associations, including a known pleiotropic missense variant in ALDH2, which fine-mapping identified as a likely causal variant for a diverse set of traits. Our findings provide insights into the genetic architecture of complex traits in East Asian populations and highlight how broadening the population diversity of GWAS samples can aid discovery.

Keywords: Korean population; complex traits; genetic architecture; genome-wide association study.

Figures

**Figure 1 |. Overview of the Korean Cancer Prevention Study-II Biobank and analysis.**
Detailed descriptions of the 36 quantitative traits examined in this study are shown in Table S1. After QC, the data were phased using SHAPEIT4 and imputed using IMPUTE5 with 1000 Genomes Project Phase 3 data.

**Figure 2 |. GWAS results for 36 quantitative traits in the Korean Cancer Prevention Study-II (KCPS2) Biobank.**
**(a)** Number of known and novel variants identified in KCPS2 compared with Open Targets Genetics using EFO terms (Tables S2–S3). **(b)** A summary of genome-wide significant loci associated with the 36 traits in KCPS2. Each locus was mapped to a gene using FUMA with the 1000 Genomes Phase 3 East Asian reference panel. We then counted the number of associated traits (out of 36 traits) per gene (Table S4). **(c)** Comparison of pairwise genetic correlations (rg) with phenotypic correlations (rp) for the 36 traits in KCPS2. rg was estimated using bivariate LDSC based on association test statistics from linear regression. Significance of rg and rp after false discovery rate correction (FDR<0.05) is indicated by color: purple if both rg and rp were significant, red if only rg was significant, blue if only rp was significant, and gray if neither was significant. The black solid line was estimated by spline smoothing from a linear regression model. The complete set of rg and rp is available in Table S5.
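
The color coding in panel (c) can be illustrated with a minimal sketch. It assumes the FDR correction is the Benjamini-Hochberg step-up procedure applied separately to the rg and rp p-values; the caption does not say which FDR method was used, and the function names `bh_reject` and `panel_color` are hypothetical.

```python
def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up: True where the test is rejected at FDR level alpha.

    Assumption: the caption only says "FDR<0.05"; BH is used here for illustration.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


def panel_color(rg_significant, rp_significant):
    """Map the two significance flags to the colors named in the caption."""
    if rg_significant and rp_significant:
        return "purple"
    if rg_significant:
        return "red"
    if rp_significant:
        return "blue"
    return "gray"


if __name__ == "__main__":
    # Toy p-values for four trait pairs (illustrative only, not from the study).
    rg_p = [0.001, 0.20, 0.004, 0.60]
    rp_p = [0.002, 0.003, 0.40, 0.70]
    rg_sig = bh_reject(rg_p)
    rp_sig = bh_reject(rp_p)
    print([panel_color(g, p) for g, p in zip(rg_sig, rp_sig)])
```

Because the procedure is step-up, a p-value that misses its own rank threshold is still rejected when a larger p-value meets its threshold.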

**Figure 3 |. Meta-analysis of 21 traits across KCPS2, KoGES, BBJ, TWB, and UKB.**
**(a)** Genome-wide significant loci identified in the meta-analysis. Dot colors indicate significance in the meta-analysis (black), KCPS2 (blue), KoGES (orange), BBJ (purple), TWB (light blue), and UKB (green). Multiple dots in a bar represent simultaneous significance in multiple cohorts. **(b)** Comparisons of allele frequency and effect sizes in KCPS2 for the genome-wide significant variants discovered only in KCPS2 (blue) versus those identified only in the meta-analysis (black). **(c)** Comparisons of effect sizes in KCPS2 and study-specific effect sizes for the lead variants at the 12,224 meta-analysis genome-wide significant loci. The solid lines were estimated by spline smoothing from a generalized additive model (b) or a linear regression model (c). Full meta-analysis results are shown in Tables S6–S7.

**Figure 4 |. Genetic architecture of complex traits across KCPS2, BBJ, TWB, and UKB.**
**(a)** The dots represent posterior means and horizontal bars represent standard errors of the parameters for each trait. The vertical dashed line shows the median of the estimates across traits. Full results are shown in Table S8. **(b-e)** Pearson correlations of SNP-heritability between KCPS2 and KoGES **(b)**, BBJ **(c)**, TWB **(d)**, and UKB **(e)** across the traits shown in **(a)**, except for TWB. For heritability in TWB shown in **(d)**, we used the heritability estimates reported by Chen and colleagues. The comparisons between KCPS2 heritability estimates and TWB heritability estimates using SBayesS and the KCPS2 LD matrix are shown in Table S6b. Data are presented as posterior means of SNP-heritability. The trait categories are indicated by different colors labeled with their trait names. For the 8 traits available in all five studies, we observed high correlations of heritability between KCPS2 and the other biobanks: KoGES (Pearson correlation r=0.99, 95% confidence interval [CI]: 0.97–1.00), BBJ (r=0.93, 95% CI: 0.64–0.99), TWB (r=0.93, 95% CI: 0.65–0.99), and UKB (r=0.97, 95% CI: 0.82–0.99).
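
To show how a confidence interval for a Pearson correlation over only 8 traits can be obtained, the sketch below uses the standard Fisher z-transformation. The caption does not state how the intervals were computed, so the method is an assumption, and the function name `fisher_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation r from n pairs.

    Uses the Fisher z-transformation: atanh(r) is treated as normal with
    standard error 1 / sqrt(n - 3). Assumption: the source does not say
    which interval method it used. Requires n > 3 and |r| < 1.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


if __name__ == "__main__":
    # r and n taken from the caption (r=0.93 across the 8 shared traits).
    lower, upper = fisher_ci(0.93, 8)
    print(f"r=0.93, n=8: 95% CI {lower:.2f} to {upper:.2f}")
```

With r=0.93 and n=8 this gives an interval of roughly 0.65 to 0.99, in line with the values quoted for BBJ and TWB; small differences are expected because the reported r is rounded.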

**Figure 5 |. Fine-mapping and colocalization analysis of *ALDH2* region in KCPS2.**
**(a)** Association between *ALDH2* (12q24.12) and alcohol intake in KCPS2. Colors in the Manhattan panels represent r² values to the lead variant rs671. In the posterior inclusion probability (PIP) panels, only fine-mapped variants in the 95% credible sets (CS) are colored. Heatmap represents significant variants (P<5.0×10⁻⁸) for the other quantitative traits. **(b)** Colocalization analysis between alcohol intake and the seven traits that showed PIP>0.9 for rs671, performed in the same region. Coloc.PP4 represents the posterior probability of colocalization at the specified region. All seven traits shown here had coloc.PP4=1 with alcohol intake. Each regional plot shows associations at the locus for the most significantly associated variant, which in every case was rs671 with PIP>0.9. Full fine-mapping and colocalization analysis results are shown in Table S10.
