Meta-Analysis
Nat Commun. 2025 May 28;16(1):4935.
doi: 10.1038/s41467-025-59950-5.

Genome-wide association studies in a large Korean cohort identify quantitative trait loci for 36 traits and illuminate their genetic architectures

Yon Ho Jee et al. Nat Commun.

Abstract

Genome-wide association studies (GWAS) have predominantly focused on European ancestry populations, limiting biological discoveries across diverse populations. Here we report GWAS findings from 153,950 individuals across 36 quantitative traits in the Korean Cancer Prevention Study-II (KCPS2) Biobank. We discovered 301 previously unreported genetic loci in KCPS2, including an association between thyroid-stimulating hormone and CD36. Meta-analysis with the Korean Genome and Epidemiology Study, Biobank Japan, Taiwan Biobank, and UK Biobank identified 4588 loci that were not significant in any contributing GWAS. We describe differences in genetic architectures across these East Asian and European samples. We also highlight East Asian-specific associations, including a known pleiotropic missense variant in ALDH2, which fine-mapping identified as a likely causal variant for multiple traits. Our findings provide insights into the genetic architecture of complex traits in East Asian populations and highlight how broadening the population diversity of GWAS samples can aid discovery.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Overview of the Korean Cancer Prevention Study-II Biobank and analysis.
Detailed descriptions of the 36 quantitative traits examined in this study are shown in Supplementary Data 1. After QC, the data were phased using SHAPEIT4 and imputed using IMPUTE5 with 1000 Genomes Project Phase 3 data.
Fig. 2. GWAS results for 36 quantitative traits in the Korean Cancer Prevention Study-II (KCPS2) Biobank.
a Number of known and novel (“previously unreported”) variants identified in KCPS2 compared with Open Targets Genetics using EFO terms (Supplementary Data 2–3). The definition of ‘novel association’ is detailed in the Methods. b A summary of genome-wide significant loci associated with the 36 traits in KCPS2. Each locus was mapped to a gene using FUMA with a 1000 Genomes Phase 3 East Asian reference panel. We then counted the number of associated traits (out of 36 traits) per gene (Supplementary Data 4). c Comparison of pairwise genetic correlations (rg) and phenotypic correlations (rp) for the 36 traits in KCPS2. rg was estimated using bivariate LDSC based on association test statistics from linear regression. Significance of rg and rp after false discovery rate correction (FDR < 0.05) is indicated by color: purple if both rg and rp were significant, red if only rg was significant, blue if only rp was significant, and gray if neither was significant. The black solid line was estimated by spline smoothing from a linear regression model. The complete set of rg and rp is available in Supplementary Data 5.
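The color coding in panel c can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the FDR correction is the Benjamini-Hochberg procedure (the caption does not name the method), and the function names bh_adjust and classify_pairs are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: the caption's "FDR" correction is Benjamini-Hochberg.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pairs(p_rg, p_rp, alpha=0.05):
    """Assign the panel c color to each trait pair.

    p_rg and p_rp hold the raw p-values of the genetic and phenotypic
    correlations for the same trait pairs, in the same order.
    """
    q_rg = bh_adjust(p_rg)
    q_rp = bh_adjust(p_rp)
    labels = []
    for qg, qp in zip(q_rg, q_rp):
        sig_g, sig_p = qg < alpha, qp < alpha
        if sig_g and sig_p:
            labels.append("purple")
        elif sig_g:
            labels.append("red")
        elif sig_p:
            labels.append("blue")
        else:
            labels.append("gray")
    return labels


# Made-up p-values for four hypothetical trait pairs.
example_labels = classify_pairs([0.001, 0.02, 0.5, 0.8],
                                [0.0001, 0.9, 0.01, 0.7])
```

Here rg and rp are corrected separately, each across all trait pairs; whether the study pooled the two sets of tests is not stated in the caption.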
Fig. 3. Meta-analysis of 21 traits across KCPS2, KoGES, BBJ, TWB, and UKB.
a Genome-wide significant loci identified in the meta-analysis. Dot colors indicate significance in the meta-analysis (black), KCPS2 (blue), KoGES (orange), BBJ (purple), TWB (light blue), and UKB (green). Multiple dots in a bar represent simultaneous significance in multiple cohorts. b Comparisons of allele frequency and effect sizes in KCPS2 for the genome-wide significant variants discovered only in KCPS2 (blue) versus those identified only in the meta-analysis (black). c Comparisons of effect sizes in KCPS2 and study-specific effect sizes for the lead variants at the 12,224 meta-analysis genome-wide significant loci. The solid lines were estimated by spline smoothing from a generalized additive model (b) or a linear regression model (c). The error bars (b and c) indicate 95% confidence intervals estimated as ±1.96 × standard error. Full meta-analysis results are shown in Supplementary Data 6 and 7.
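How a locus can reach significance only in the meta-analysis, and how the error bars are formed, can be illustrated with a short sketch. It assumes a fixed-effect inverse-variance-weighted meta-analysis, which this page does not confirm as the method used; the function names and the numbers are hypothetical. Only the ±1.96 × standard error interval and the genome-wide threshold of 5.0 × 10−8 come from the captions.

```python
import math

GENOME_WIDE_P = 5e-8  # threshold quoted in the Fig. 5 caption


def ci95(beta, se):
    """95% confidence interval as beta ± 1.96 × standard error."""
    return beta - 1.96 * se, beta + 1.96 * se


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis (assumed method).

    Returns the pooled effect, its standard error, and a two-sided
    p-value from the normal distribution.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Made-up example: two cohorts with the same effect estimate.
_, _, p_single = ivw_meta([0.05], [0.01])
beta_meta, se_meta, p_meta = ivw_meta([0.05, 0.05], [0.01, 0.01])
low, high = ci95(beta_meta, se_meta)
```

In this made-up example each cohort alone has z = 5 and misses the genome-wide threshold, while the pooled estimate has a smaller standard error and passes it.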
Fig. 4. Genetic architecture of complex traits across KCPS2 (n = 153 K), KoGES (n = 72 K), BBJ (n = 179 K), TWB (n = 102 K), and UKB (n = 420 K).
a The dots represent posterior means and horizontal bars represent standard errors of the parameters for each trait. The vertical dashed line shows the median of the estimates across traits. Full results are shown in Supplementary Data 8. b–e Pearson correlations of SNP-heritability between KCPS2 and KoGES (b), BBJ (c), TWB (d), and UKB (e) across the traits shown in a, except for TWB. For heritability in TWB shown in d, we used the heritability estimates reported by Chen et al. The comparisons between KCPS2 heritability estimates and TWB heritability estimates using SBayesS and the KCPS2 LD matrix are shown in Supplementary Fig. 6b. Data are presented as posterior means of SNP-heritability. The trait categories are indicated by different colors labeled with their trait names. For the eight traits available in all five studies, we observed high correlations of heritability between KCPS2 and the other biobanks: KoGES (Pearson correlation r = 0.99, 95% confidence interval [CI]: 0.97–1.00), BBJ (r = 0.93, 95% CI: 0.64–0.99), TWB (r = 0.93, 95% CI: 0.65–0.99), and UKB (r = 0.97, 95% CI: 0.82–0.99). All Pearson correlation tests were two-sided, and the reported p-values were not corrected for multiple testing.
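The wide intervals around these correlations follow from the small number of traits (eight). The sketch below computes a Pearson correlation and a 95% CI using the Fisher z-transformation. The caption does not say how the intervals were obtained, so the Fisher method is an assumption, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Two-sided 95% CI for r via the Fisher z-transformation.

    Assumed method; requires |r| < 1 and n > 3.
    """
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# r = 0.93 across n = 8 traits, as quoted for BBJ and TWB.
ci_low, ci_high = fisher_ci(0.93, 8)
```

With r = 0.93 and n = 8 this gives an interval of roughly 0.65 to 0.99, close to the intervals quoted for BBJ and TWB, which suggests but does not confirm that a similar procedure was used.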
Fig. 5. Fine-mapping and colocalization analysis of ALDH2 region in KCPS2.
a Association between ALDH2 (12q24.12) and alcohol intake in KCPS2, estimated using a linear mixed model. Colors in the Manhattan panels represent r2 values relative to the lead variant rs671. In the posterior inclusion probability (PIP) panels, derived from SuSiE fine-mapping, only fine-mapped variants in the 95% credible sets (CS) are colored. The heatmap represents significant GWAS variants (P < 5.0 × 10−8) identified from linear mixed model analyses for the other quantitative traits. b Colocalization analysis (performed using coloc) between alcohol intake and the seven traits that showed PIP > 0.9 for rs671, carried out in the same region. All seven traits shown here had a posterior probability of colocalization with alcohol intake of 1 in the specified region. Each regional plot shows the associations at the locus for one trait; the most significantly associated variant was rs671, with PIP > 0.9, in every case. Full fine-mapping and colocalization analysis results are shown in Supplementary Data 10. All reported −log10(P-values) are from two-sided tests and are uncorrected for multiple testing.
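The idea of a 95% credible set can be sketched for the simplest case of a single causal signal: rank variants by posterior probability and keep adding them until the cumulative probability reaches 0.95. This is a simplified illustration, not the SuSiE implementation, which builds one such set per modeled effect and applies additional filters. The function name, the variant IDs other than rs671, and all probabilities are made up.

```python
def credible_set(pips, coverage=0.95):
    """Smallest set of variants whose posterior probabilities sum to
    at least `coverage`, assuming a single causal signal.

    pips maps variant ID to posterior probability.
    """
    ranked = sorted(pips.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, pip in ranked:
        chosen.append(variant)
        cumulative += pip
        if cumulative >= coverage:
            break
    return chosen


# Made-up probabilities: one dominant variant gives a one-variant set.
sharp = credible_set({"rs671": 0.97, "rsA": 0.02, "rsB": 0.01})

# A diffuse signal needs several variants to reach 95% coverage.
diffuse = credible_set({"v1": 0.6, "v2": 0.3, "v3": 0.08, "v4": 0.02})
```

A set containing a single variant with high probability, as in the first example, is the situation the caption describes for rs671 (PIP > 0.9).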
