[Preprint]. 2023 Sep 29:2023.09.28.23296244.
doi: 10.1101/2023.09.28.23296244.

Rare variant association analysis in 51,256 type 2 diabetes cases and 370,487 controls informs the spectrum of pathogenicity of monogenic diabetes genes


Philip Schroeder et al. medRxiv.

Abstract

We meta-analyzed array data imputed with the TOPMed reference panel together with whole-genome sequence (WGS) datasets, performing the largest rare-variant (minor allele frequency as low as 5×10⁻⁵) GWAS meta-analysis of type 2 diabetes (T2D), comprising 51,256 cases and 370,487 controls. We identified 52 novel variants at genome-wide significance (p < 5×10⁻⁸), including 8 novel variants that were either rare or ancestry-specific. Among them, we identified a rare missense variant in HNF4A, p.Arg114Trp (OR=8.2, 95% confidence interval [CI]=4.6-14.0, p=1.08×10⁻¹³), previously reported as a variant implicated in Maturity Onset Diabetes of the Young (MODY) with incomplete penetrance. We demonstrated that the diabetes risk in carriers of this variant was modulated by a T2D common variant polygenic risk score (cvPRS): carriers in the top PRS tertile (OR=18.3, 95% CI=7.2-46.9, p=1.2×10⁻⁹) vs carriers in the bottom PRS tertile (OR=2.6, 95% CI=0.97-7.09, p=0.06). Association results identified eight variants of intermediate penetrance (OR>5) in monogenic diabetes (MD) genes, which in aggregate as a rare variant PRS were associated with T2D in an independent WGS dataset (OR=4.7, 95% CI=1.86-11.77, p=0.001). Our data also provided supporting evidence for classifying 21% of the variants reported in ClinVar in these MD genes as benign, based on lack of association with T2D. Our work provides a framework for using rare variant imputation and WGS analyses in large-scale population-based association studies to identify large-effect rare variants and provide evidence for informing variant pathogenicity.

Figures

Figure 1.
T2D GWAS discovery and overall analysis approach. (a) Overview of the cohorts, sample sizes, and pre-processing steps for each cohort included in the T2D GWAS meta-analysis. (b) Manhattan plots for variants with an overall study MAF > 0.001 (left) and MAF < 0.001 (right). Pink represents variants that reached genome-wide significance (p < 5×10⁻⁸). (c) Odds ratios for all genome-wide significant, conditionally independent variants plotted across MAF. Novel and known signals are represented in purple and green, respectively, and primary and secondary signals as stars and circles, respectively. (d) Overview of the downstream analysis that uses rare variant GWAS meta-analysis results to inform the classification of variants in MD genes within several ClinVar categories. We selected all the variants observed in ClinVar in genes involved in monogenic diabetes. For those present in our meta-analysis, we categorized variants as “supporting moderate pathogenic” (mod. path), “supporting benign”, or “inconclusive” according to the odds ratio and confidence interval. Variants with sufficient carriers within the mod. path category were stratified in tertiles based on their common variant PRS (cvPRS). Finally, we validated the identified mod. path variants in an external dataset (All of Us), comparing the aggregate effect of carrier status in rare variant PRSs.
Figure 2.
Functional characterization of two novel low-frequency variants associated with T2D. (a,e) LocusZoom plots for the rs373676665 (a) and rs147287548 (e) regions. Each point represents a variant, with its p-value (on a −log10 scale, y-axis) derived from the meta-analysis association results. The x-axis represents the genomic position (GRCh38). (b,f) Representation of chromatin accessibility (ATAC-seq), H3K27ac, and H3K4me1 ChIP-seq signal coverage in T2D-relevant tissues. (f) The box with the dashed line highlights the chromatin fragment that contains rs147287548, which shows significant long-range chromatin interactions with the promoter of the LEP gene in mesenchymal stem cells and throughout in vitro adipogenesis. The wider chromatin landscape of this locus and chromatin interactions detected by enhancer-capture HiC are shown in Supplementary Figure 7. Details of the datasets shown are provided in Supplementary Table 7. (c,g) Transcription factor motif disruption results. The minor alleles of rs373676665 (c) and rs147287548 (g) are predicted to disrupt binding sites for PPARalpha and NFATc, respectively. (d,h) Forest plots showing the carrier counts and odds ratios of rs373676665 (d) and rs147287548 (h). Odds ratios are denoted by boxes proportional to the size of the cohort, with 95% CI error bars.
Figure 3.
Classification of variants in MD genes from ClinVar and development of rare-variant PRS. (a) Overview of variant classification according to meta-analysis results in UKB/GERA/MGBB (excluding All of Us). We extracted variants in MD genes from ClinVar labeled as “uncertain significance”, “conflicting interpretations of pathogenicity”, “likely benign”, or “likely pathogenic”. We then classified these variants based on the UKB/GERA/MGBB meta-analysis odds ratio (OR) and 95% confidence interval (CI) lower bound (LB) and upper bound (UB). Variants with a meta-analytic OR > 5 and an OR 95% LB > 2 are classified as “supports moderately pathogenic” (red). Variants with an OR 95% UB < 2 are classified as “supports benign” (green). Variants with an OR 95% UB > 2 and LB < 2 are classified as “inconclusive” (blue). (b) Results of this analysis for variants of “conflicting interpretations of pathogenicity” and “uncertain significance” according to ClinVar. The x-axis represents MAF, and odds ratios are represented on the y-axis. Only variants with MAF < 0.001 were considered for this analysis. (c) Variants identified as “moderately pathogenic” were aggregated in a rare variant polygenic risk score (rvPRS), a weighted sum of the effect alleles, and tested in AoU. The OR for each unit of the rvPRS is shown in addition to the OR associated with being a carrier of any of the risk alleles (Carrier). Forest plots are shown for the groups of variants classified as “supports benign”, “supporting moderately pathogenic”, and “inconclusive”.
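The thresholds in panel (a) and the weighted sum in panel (c) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the names `VariantEstimate`, `classify_variant`, and `rare_variant_prs` are hypothetical, the use of log(OR) as the per-allele weight is an assumption (the caption says only "a weighted sum of the effect alleles"), and the "unclassified" label covers estimates that the three stated rules do not address (for example, LB > 2 with OR ≤ 5).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantEstimate:
    """Meta-analysis estimate for one variant (hypothetical container)."""
    variant_id: str
    odds_ratio: float
    ci_lower: float  # lower bound (LB) of the 95% CI of the OR
    ci_upper: float  # upper bound (UB) of the 95% CI of the OR


def classify_variant(v: VariantEstimate) -> str:
    """Apply the three rules stated in the Figure 3(a) caption."""
    if v.ci_upper < 2:
        return "supports benign"
    if v.odds_ratio > 5 and v.ci_lower > 2:
        return "supports moderately pathogenic"
    if v.ci_upper > 2 and v.ci_lower < 2:
        return "inconclusive"
    # Assumption: the caption does not say how such estimates are handled.
    return "unclassified"


def rare_variant_prs(dosages: dict, odds_ratios: dict) -> float:
    """Weighted sum of effect-allele dosages.

    Assumption: each weight is log(OR); the caption does not name the weight.
    Variants without a known OR are ignored.
    """
    return sum(
        math.log(odds_ratios[vid]) * dosage
        for vid, dosage in dosages.items()
        if vid in odds_ratios
    )


# Illustrative, made-up estimates (not results from the study).
examples = [
    VariantEstimate("hypothetical_A", 7.5, 3.1, 18.0),
    VariantEstimate("hypothetical_B", 1.1, 0.8, 1.5),
    VariantEstimate("hypothetical_C", 3.0, 0.9, 10.0),
]
labels = {v.variant_id: classify_variant(v) for v in examples}
```

Under the log(OR) assumption, a carrier of one copy of a single variant with OR = 6 would receive a score of log(6), so the per-unit OR reported for the rvPRS is on the log-odds scale of the discovery estimates.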
Figure 4.
Effect of moderately pathogenic variants versus confirmed pathogenic MODY variants on diabetes risk and on related clinical variables. (a-c) Forest plots showing the effect of p.Arg114Trp, p.Pro475Leu, and p.Val455Glu, stratified by common variant PRS (cvPRS) tertiles. Odds ratios (OR) are denoted by boxes proportional to the size of the cvPRS subgroup, with 95% CI error bars. ORs are relative to non-carriers in the middle tertile of the cvPRS, who were treated as the reference group. At the top of each forest plot, the effect of being a carrier of a confirmed pathogenic variant for HNF4A (a), HNF1A (b), and GCK (c) MODY is also represented, using data identified via exome sequencing in UK Biobank. For each effect estimate, the diabetes case definition included individuals with type 1 or type 2 diabetes. (d-f) Boxplots of HbA1c (%), random glucose (mg/dL), and BMI (kg/m²) in diabetes cases (red) and non-cases (blue) among non-carriers (left), carriers of a moderately pathogenic variant (middle), and carriers of confirmed pathogenic MODY variants (right) in HNF4A (d), HNF1A (e), and GCK (f). The covariate-adjusted p-value is included for comparisons with significant differences (p < 0.05) between groups.
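The tertile stratification in panels (a-c) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the helper names `assign_tertiles` and `odds_ratio_vs_reference` are hypothetical, tertile cut points are taken from `statistics.quantiles`, and the odds ratio is an unadjusted estimate from a 2×2 table with a Wald 95% CI. The caption refers to covariate-adjusted p-values, so the study's estimates were presumably obtained from an adjusted model rather than raw counts.

```python
import math
from statistics import quantiles


def assign_tertiles(scores):
    """Return 0 (bottom), 1 (middle), or 2 (top) for each cvPRS value."""
    cut_low, cut_high = quantiles(scores, n=3)  # two cut points
    return [0 if s <= cut_low else 1 if s <= cut_high else 2 for s in scores]


def odds_ratio_vs_reference(cases, controls, ref_cases, ref_controls):
    """Unadjusted OR of a subgroup vs the reference group, with Wald 95% CI.

    In the figure, the reference group is non-carriers in the middle
    cvPRS tertile. All four counts must be positive.
    """
    odds_ratio = (cases * ref_controls) / (controls * ref_cases)
    se_log_or = math.sqrt(
        1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Illustrative, made-up inputs (not data from the study).
tertiles = assign_tertiles([1, 2, 3, 4, 5, 6, 7, 8, 9])
estimate = odds_ratio_vs_reference(20, 10, 10, 20)
```

With small carrier counts the Wald interval becomes wide and unreliable, which is one reason only variants with sufficient carriers are stratified this way.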
