Science. 2025 Feb 7;387(6734):eadp4753.
doi: 10.1126/science.adp4753. Epub 2025 Feb 7.

Kidney multiome-based genetic scorecard reveals convergent coding and regulatory variants

Hongbo Liu, Amin Abedini, Eunji Ha, Ziyuan Ma, Xin Sheng, Bernhard Dumoulin, Chengxiang Qiu, Tamas Aranyi, Shen Li, Nicole Dittrich, Hua-Chang Chen, Ran Tao, Der-Cherng Tarng, Feng-Jen Hsieh, Shih-Ann Chen, Shun-Fa Yang, Mei-Yueh Lee, Pui-Yan Kwok, Jer-Yuarn Wu, Chien-Hsiun Chen, Atlas Khan, Nita A Limdi, Wei-Qi Wei, Theresa L Walunas, Elizabeth W Karlson, Eimear E Kenny, Yuan Luo, Leah Kottyan, John J Connolly, Gail P Jarvik, Chunhua Weng, Ning Shang, Joanne B Cole, Josep M Mercader, Ravi Mandla, Timothy D Majarian, Jose C Florez, Mary E Haas, Luca A Lotta, Regeneron Genetics Center, GHS-RGC DiscovEHR Collaboration, Theodore G Drivas, Penn Medicine BioBank, Ha My T Vy, Girish N Nadkarni, Laura K Wiley, Melissa P Wilson, Christopher R Gignoux, Humaira Rasheed, Laurent F Thomas, Bjørn Olav Åsvold, Ben M Brumpton, Stein I Hallan, Kristian Hveem, Jie Zheng, Jacklyn N Hellwege, Matthew Zawistowski, Sebastian Zöllner, Nora Franceschini, Hailong Hu, Jianfu Zhou, Krzysztof Kiryluk, Marylyn D Ritchie, Matthew Palmer, Todd L Edwards, Benjamin F Voight, Adriana M Hung, Katalin Susztak

Hongbo Liu et al. Science. 2025.

Abstract

Kidney dysfunction is a major cause of mortality, but its genetic architecture remains elusive. In this study, we conducted a multiancestry genome-wide association study in 2.2 million individuals and identified 1026 (97 previously unknown) independent loci. Ancestry-specific analysis indicated an attenuation of newly identified signals on common variants in European ancestry populations and the power of population diversity for further discoveries. We defined genotype effects on allele-specific gene expression and regulatory circuitries in more than 700 human kidneys and 237,000 cells. We found 1363 coding variants disrupting 782 genes, with 601 genes also targeted by regulatory variants and convergence in 161 genes. Integrating 32 types of genetic information, we present the "Kidney Disease Genetic Scorecard" for prioritizing potentially causal genes, cell types, and druggable targets for kidney disease.

Conflict of interest statement

Competing interests: The laboratory of Dr. Susztak receives funding from GSK, Regeneron, Gilead, Merck, Boehringer Ingelheim, Bayer, Novartis, Maze, Jnana, Ventus, and Novo Nordisk. The funders had no influence on the data analysis. Dr. Susztak serves on the SAB of Jnana Pharmaceuticals and receives equity. Dr. Ritchie serves on the SAB for Goldfinch Bio and Cipherome. The laboratory of Dr. Walunas receives funding from Gilead Sciences. The other authors declare no competing interests.

Figures

Fig. 1. eGFRcrea GWAS of 2.2 million individuals.
(A) Manhattan plots of eGFRcrea GWAS of 2,287,877 individuals of multi-ancestry and 1,785,582 individuals of European-ancestry. The x-axis is the chromosomal location of SNP. The y-axis is the strength of association −log10(p value, GWAS meta-analysis) (the scale is capped at 300). Lead SNPs of independent loci are highlighted. Green dots represent loci overlapping or nearby (within 500 kb or LD R2 > 0.001) previously reported sentinel variants, and purple represents loci not reported previously. Prioritized genes for new loci are highlighted. (B) Fraction of known and new independent loci identified in GWAS of multi-ancestry. (C) The number of independent loci with at least one variant validated by eGFRcys and BUN. (D) Number of GWAS loci identified in down-sampled GWAS of European-ancestry. The lines represent loci defined by different locus window sizes (35 kb, 50 kb, and 100 kb). (E) Overlap of credible sets by fine-mapping in eGFRcrea GWAS of 2,287,877 individuals of multi-ancestry and 1,785,582 individuals of European-ancestry. (F) Enrichment of credible set variants in non-synonymous variants, kidney enhancers, and active promoters. The y-axis is the odds ratio, and the chi-square test significance is presented in each bar. (G) Locus-zoom of credible sets by fine-mapping in the IRF5 locus in European-ancestry GWAS. The x-axis is the chromosomal location of SNP. The panels from top to bottom show the original GWAS, posterior inclusion probability, credible set 1, credible set 2, and gene annotations. (H) Pleiotropy maps of eGFRcrea and other traits. The x-axis is the number of colocalization loci between eGFRcrea and related GWAS traits listed on the y-axis.
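Panel (F) of Fig. 1 reports an odds ratio together with a chi-square test for each annotation class. As a rough illustration of that kind of enrichment calculation, the sketch below computes both from a 2×2 contingency table using only the Python standard library. The function name, the table layout, and the counts are hypothetical and chosen for illustration; the caption does not state how the background set was defined or whether a continuity correction was applied, so the uncorrected Pearson statistic here is an assumption.

```python
import math

def odds_ratio_and_chi2(a, b, c, d):
    """Enrichment on a 2x2 table (hypothetical layout):
         a = credible-set variants with the annotation
         b = credible-set variants without it
         c = background variants with the annotation
         d = background variants without it
    Returns (odds ratio, Pearson chi-square, p value for 1 degree of freedom).
    """
    odds_ratio = (a * d) / (b * c)
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value

# Made-up counts, not taken from the paper.
or_, chi2, p = odds_ratio_and_chi2(30, 70, 10, 90)
print(f"odds ratio = {or_:.2f}, chi2 = {chi2:.2f}, p = {p:.2e}")
```

An odds ratio above 1 indicates that the annotation is more common among credible-set variants than in the background, and the chi-square p value indicates how unlikely such a difference would be under independence.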
Fig. 2. Allele-specific expression (ASE) analysis in tubular and glomerular compartments.
(A) Analytical scheme and Manhattan plots of the ASE analyses in tubule and glomeruli compartments. In the Manhattan plot, the x-axis is the chromosomal location of SNP. The y-axis represents the strength of association −log10(p value, ASE) (the scale is capped at 300). (B) Overlap of ASE genes identified in tubule and glomerular compartments using RASQUAL. (C) Number of ASE genes identified by RASQUAL and eQTL genes identified by MatrixeQTL in the same dataset (left panel) and fraction of genes validated by AlleleDB ASEs (right panel). (D) Locus-zoom of eGFRcrea GWAS, tubule ASE, glomeruli ASE, and kidney eQTL in GATM locus. The y-axis represents the strength of association −log10(p value). PP.H4 is the posterior probability of colocalization between ASE and eGFRcrea GWAS. (E) Number of eGFRcrea-associated variants with ASEs/eQTLs (left panel) and the fraction of ASE-only/eQTL-only variants in fine-mapping credible sets of eGFRcrea GWAS (right panel). (F) Number of genes whose ASEs/eQTLs are colocalized with eGFRcrea GWAS (left panel) and the fraction of ASE-only/eQTL-only genes reported as nephropathy genes (right panel). (G) Locus-zoom of eGFRcrea GWAS, tubule ASE, glomeruli ASE, and kidney eQTL in the SLC22A2 locus. The y-axis represents the strength of association −log10(p value). PP.H4 is the posterior probability of colocalization between ASE/eQTL and eGFRcrea GWAS.
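The Manhattan plots in Fig. 1 and Fig. 2 show −log10(p value) on the y-axis with the scale capped at 300. A minimal sketch of such a transform is given below; the function name `neg_log10_p` is hypothetical, and treating a p value that underflows to zero as sitting at the cap is an assumption, not something the captions specify.

```python
import math

def neg_log10_p(p, cap=300.0):
    """Return -log10(p), limited to `cap` for plotting.

    A p value of exactly 0 (for example after floating-point underflow)
    is mapped to the cap, which is an assumption made for this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p value must lie in [0, 1]")
    if p == 0.0:
        return cap
    return min(-math.log10(p), cap)

# Illustrative values only: a genome-wide significant hit, an extreme hit, and zero.
for p in (5e-8, 1e-320, 0.0):
    print(p, neg_log10_p(p))
```

Capping keeps a handful of extremely strong associations from compressing the rest of the plot.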
Fig. 3. Allele-specific accessibility analysis in human kidneys.
(A) Analytical workflow and Manhattan plot of the bASA analyses in human kidneys. Manhattan plot: the x-axis is the chromosomal location of SNP, and the y-axis is the strength of association −log10(p value, bASA). (B) Enrichment of bASA peaks in kidney chromatin states which were estimated using histone modifications. The x-axis is the log2-transformed odds ratio. The y-axis is enrichment significance −log10(p value). (C) Association of effect sizes between kidney bASAs (x-axis) and meQTLs (y-axis) (Spearman correlation test). (E) Prioritization of GWAS variants and loci using bASAs in conjunction with ASEs and meQTLs. The upper panel represents the analytical workflow to identify bASA and GWAS variants associated with open chromatin and kidney function. The middle panel shows the fraction of bASA and GWAS variants overlapped with ASEs and/or meQTLs. The middle panel also shows the fraction of bASA and GWAS colocalization loci, which also colocalized with ASEs and/or meQTLs. (F) Regulatory elements and locus-zoom of eGFRcrea GWAS, kidney bASA, meQTL, and tubule ASE in USP24 locus. Chromatin states are represented by different colors. In the locus-zoom plots, the y-axis is the strength of association −log10(p value). PP.H4 is the posterior probability of colocalization between eGFRcrea GWAS and molecular traits.
Fig. 4. Single cell-level allele-specific accessibility analysis in human kidneys.
(A) Analytical workflow of snASA analyses in human kidney cell types. (B) Number of cells (x-axis) and snASA SNPs (y-axis) identified in each cell type. (C) Fraction of snASA SNPs within open chromatin peaks. (D) Cell type-specific and shared snASA SNPs (rows) across kidney cell types (columns). The color represents the effect size of ASA from low (blue) to high (red). GWAS variants are marked in the right column. (E) Fraction of cell type-specific and shared snASA SNPs validated by bASA. The significance of shared snASA SNPs with bASA was calculated using the chi-squared test. (F) Association of effect sizes between PT snASAs (x-axis) and kidney bASAs (y-axis). The correlation of effect size was calculated by the Spearman correlation test. (G) Enrichment of cell type-specific snASA SNPs on TFBSs. White blocks: non-significant. Yellow to red indicates the significance of TF enrichment (enrichment score > 3 and p value < 3.68 × 10^−5, Bonferroni's threshold accounting for 1,358 tests across 7 cell types and 194 TFs). (H) Fraction of snASA & GWAS variants overlapped with TF binding sites. (I) Examples of snASA & GWAS variants overlapped with cell type-specific open chromatin peaks and TFBS.
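The Bonferroni threshold quoted in panel (G) of Fig. 4 can be checked with a few lines of arithmetic. The sketch below assumes a family-wise significance level of 0.05, which the caption does not state explicitly but which reproduces the reported value; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

n_cell_types = 7
n_tfs = 194
n_tests = n_cell_types * n_tfs  # 7 x 194 = 1,358 tests, as in the caption

# alpha = 0.05 is an assumption; the caption gives only the resulting threshold.
threshold = bonferroni_threshold(0.05, n_tests)
print(f"{n_tests} tests, threshold = {threshold:.2e}")
```

With these inputs the per-test threshold rounds to 3.68 × 10^−5, matching the caption.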
Fig. 5. Human kidney single cell multi-omics Open4Gene associations link GWAS variants to target genes.
(A) Analytical workflow of single-nucleus multiome (RNA+ATAC) and Open4Gene analyses in human kidney cells. (B) Number of cells (x-axis) and Open4Gene links (y-axis) identified in each cell type. (C) Fraction of open chromatin peaks targeting unique genes and multiple genes. (D) Prioritization of target genes of GWAS variants using Open4Gene links. (E) eGFRcrea GWAS strength (y-axis) and distance from GWAS variant to Open4Gene peaks (x-axis). (F) The number of GWAS variants identified for each gene using Open4Gene links. The y-axis is the number of GWAS variants, and the x-axis shows the genes with a given number of GWAS variants. Top genes and known kidney disease genes were labeled. (G) Locus-zoom of eGFRcrea GWAS, Open4Gene links, and regulatory elements in the LRP2 locus. In the locus-zoom plot, the y-axis is the strength of association −log10(p value). Chromatin states are represented by different colors. The tracks on the left panel show the open chromatin in each cell type, and the right panel shows the expression of LRP2 in each cell type.
Fig. 6. Kidney Disease Genetic Scorecard indicates the convergence of coding and regulatory variants in GWAS loci.
(A) Analytical workflow of prioritization of eGFRcrea GWAS variants by integrating kidney multi-omics. (B) Prioritization of regulatory variants by integrating kidney multi-omics. The left y-axis is the number (bar chart) of non-coding variants prioritized using each prioritization score cutoff on the x-axis. The right y-axis shows the fraction (red dot plot) of variants targeting unique genes and the fraction (blue dot plot) of validated nephropathy genes prioritized by each prioritization score cutoff on the x-axis. The gray dashed line shows the prioritization threshold that covers all validated nephropathy genes. (C) The number of eGFRcrea GWAS variants located in different genome features (left panel), and enrichment test with log-transformed odds ratio on the x-axis and significance labeled for each feature (right panel). (D) Fraction of coding GWAS variants annotated as pLOF, missense, and synonymous. (E) Number of eGFRcrea GWAS loci prioritized by regulatory variants (prioritization score ≥ 10) and coding variants. (F) Number of genes validated by eGFRcys/BUN, with the convergence of coding and regulatory variants, cell type-specific expression, and targeted by FDA-approved drugs. (G) Kidney Disease Genetic Scorecard of genes with convergence of coding and regulatory variants (prioritization score ≥ 12). This figure only shows the genes with the highest expression and open chromatin on their regulatory variants in the same cell type. The full list is available in table S25.
