Meta-Analysis
Diabetologia. 2023 Jul;66(7):1273-1288.
doi: 10.1007/s00125-023-05912-9. Epub 2023 May 6.

The power of TOPMed imputation for the discovery of Latino-enriched rare variants associated with type 2 diabetes

Alicia Huerta-Chagoya, Philip Schroeder, Ravi Mandla, Aaron J Deutsch, Wanying Zhu, Lauren Petty, Xiaoyan Yi, Joanne B Cole, Miriam S Udler, Peter Dornbos, Bianca Porneala, Daniel DiCorpo, Ching-Ti Liu, Josephine H Li, Lukasz Szczerbiński, Varinderpal Kaur, Joohyun Kim, Yingchang Lu, Alicia Martin, Decio L Eizirik, Piero Marchetti, Lorella Marselli, Ling Chen, Shylaja Srinivasan, Jennifer Todd, Jason Flannick, Rose Gubitosi-Klug, Lynne Levitsky, Rachana Shah, Megan Kelsey, Brian Burke, Dana M Dabelea, Jasmin Divers, Santica Marcovina, Lauren Stalbow, Ruth J F Loos, Burcu F Darst, Charles Kooperberg, Laura M Raffield, Christopher Haiman, Quan Sun, Joseph B McCormick, Susan P Fisher-Hoch, Maria L Ordoñez, James Meigs, Leslie J Baier, Clicerio González-Villalpando, Maria Elena González-Villalpando, Lorena Orozco, Lourdes García-García, Andrés Moreno-Estrada, Mexican Biobank, Carlos A Aguilar-Salinas, Teresa Tusié, Josée Dupuis, Maggie C Y Ng, Alisa Manning, Heather M Highland, Miriam Cnop, Robert Hanson, Jennifer Below, Jose C Florez, Aaron Leong, Josep M Mercader

Alicia Huerta-Chagoya et al. Diabetologia. 2023 Jul.

Abstract

Aims/hypothesis: The Latino population has been systematically underrepresented in large-scale genetic analyses, and previous studies have relied on the imputation of ungenotyped variants based on the 1000 Genomes (1000G) imputation panel, which results in suboptimal capture of low-frequency or Latino-enriched variants. The National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) programme released the largest multi-ancestry genotype reference panel, representing a unique opportunity to analyse rare genetic variation in the Latino population. We hypothesise that a more comprehensive analysis of low-frequency and rare variation using the TOPMed panel would improve our knowledge of the genetics of type 2 diabetes in the Latino population.

Methods: We evaluated the TOPMed imputation performance using genotyping array and whole-exome sequence data in six Latino cohorts. To evaluate the ability of TOPMed imputation to increase the number of identified loci, we performed a Latino type 2 diabetes genome-wide association study (GWAS) meta-analysis in 8150 individuals with type 2 diabetes and 10,735 control individuals and replicated the results in six additional cohorts including whole-genome sequence data from the All of Us cohort.
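As a rough illustration of the discovery sample described above, the effective sample size of a case-control study is often approximated as 4 / (1/N_cases + 1/N_controls). The sketch below applies that convention to the counts quoted in the Methods. The formula choice and the function name `effective_sample_size` are assumptions made for illustration; the abstract does not state how the authors computed their effective sample size.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Approximate effective sample size of a case-control study.

    Uses the common convention 4 / (1/n_cases + 1/n_controls).
    This is an assumed convention, not a formula taken from the abstract.
    """
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


# Discovery counts quoted in the Methods paragraph.
n_eff = effective_sample_size(8150, 10735)
print(f"Approximate effective sample size: {n_eff:.0f}")
```

Under this convention a balanced design gives an effective size equal to the total sample, while an unbalanced design gives a smaller value.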

Results: Compared with imputation with 1000G, the TOPMed panel improved the identification of rare and low-frequency variants. We identified 26 genome-wide significant signals including a novel variant (minor allele frequency 1.7%; OR 1.37, p=3.4 × 10⁻⁹). A Latino-tailored polygenic score constructed from our data and GWAS data from East Asian and European populations improved the prediction accuracy in a Latino target dataset, explaining up to 7.6% of the type 2 diabetes risk variance.
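A polygenic score is commonly computed as a weighted sum of effect-allele dosages, with each weight taken as the log odds ratio of the variant. The sketch below shows only that arithmetic. The variant names, the two ORs other than 1.37, and the dosages are hypothetical; the abstract does not describe how the published score was constructed, and this example does not reproduce the weights deposited in the PGS catalog.

```python
import math


def polygenic_score(dosages: dict, weights: dict) -> float:
    """Weighted sum of effect-allele dosages (0 to 2) over the variants in `weights`."""
    missing = sorted(set(weights) - set(dosages))
    if missing:
        raise KeyError(f"dosages missing for variants: {missing}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical per-variant weights expressed as log odds ratios.
# Only the OR of 1.37 comes from the abstract; the others are made up.
weights = {
    "variant_A": math.log(1.37),
    "variant_B": math.log(1.10),
    "variant_C": math.log(0.90),
}

# Hypothetical dosages for one individual (count of effect alleles).
dosages = {"variant_A": 1, "variant_B": 2, "variant_C": 0}

score = polygenic_score(dosages, weights)
print(f"Score on the log-odds scale: {score:.4f}")
```

In practice, a score like this is usually standardised within the target cohort before an OR per SD is reported.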

Conclusions/interpretation: Our results demonstrate the utility of TOPMed imputation for identifying low-frequency variants in understudied populations, leading to the discovery of novel disease associations and the improvement of polygenic scores.

Data availability: Full summary statistics are available through the Common Metabolic Diseases Knowledge Portal (https://t2d.hugeamp.org/downloads.html) and through the GWAS catalog (https://www.ebi.ac.uk/gwas/, accession ID: GCST90255648). Polygenic score (PS) weights for each ancestry are available via the PGS catalog (https://www.pgscatalog.org, publication ID: PGP000445, score IDs: PGS003443, PGS003444 and PGS003445).

Keywords: GWAS meta-analysis; Latino population; Polygenic score; TOPMed imputation; Type 2 diabetes.

Figures

Fig. 1
General overview of the study. Six cohorts of admixed Latino ancestry, representing a total of 8150 type 2 diabetes cases and 10,735 controls, were imputed with the TOPMed and 1000G Phase 3 panels (grey box). A type 2 diabetes GWAS meta-analysis of the imputed variants resulted in the identification of two novel loci, which were tested for replication in six additional Latino cohorts (green box). They were also interrogated for association with a collection of phenotypes in eight Latino cohorts (blue box) and for functional evidence in multiple available resources (purple box). The generated Latino type 2 diabetes GWAS data were used, in combination with GWAS from other ancestries, to construct ancestry-specific and multi-ancestry type 2 diabetes PSs (brown box). CMDK, Common Metabolic Disease Knowledge; sum stats, summary statistics
Fig. 2
Performance of the TOPMed reference panel for the imputation of Latino samples. (a) Number of chromosome-wide well-imputed variants (imputation r²≥0.8) by AF for each analysed cohort when using the 1000GP3 (blue) or the TOPMed (black) reference panels. (b) Average chromosome-wide imputation quality by AF for each analysed cohort when using the 1000GP3 (blue) or the TOPMed (black) reference panels. (c) Effective sample size required for reaching 80% statistical power to detect genome-wide significant signals at different effect sizes (OR). The dotted lines show the discovery effective sample size of this study. (d) Percentage of the exome sequenced variants in chromosome 22 that could be imputed when using the 1000GP3 (blue) or the TOPMed (black) reference panels. (e) Percentage of the exome sequenced LoF and deleterious predicted variants based on CADD score in chromosome 22 that could be imputed when using the 1000GP3 (blue) or the TOPMed (black) reference panels
Fig. 3
Type 2 diabetes GWAS meta-analysis in the Latino population. (a) Manhattan plot of the meta-analysis association statistics, highlighting the loci with genome-wide significance (red) or sub-genome-wide significance (orange) for type 2 diabetes. (b) Regional association plot of the novel ORC5/LHFPL3 locus associated with type 2 diabetes risk. (c) Forest plot of the GWAS association statistics for the novel ORC5/LHFPL3 locus in the discovery (black), the replication (blue) and overall (red) cohorts
Fig. 4
Sub-genome-wide significant HDAC2 novel type 2 diabetes loci. (a) Scatter plot of the effect allele frequencies (EAFs) from the sub-genome-wide significant variants in Latino (LAT) vs European (EUR) populations, highlighting those that are distinct from the known lead type 2 diabetes-associated variants (purple) and those that are in novel loci (yellow). (b) Regional association plot of the novel HDAC2 locus associated with type 2 diabetes risk. (c) Forest plot of the association statistics in the discovery (black) and the replication (blue) cohorts. (d) Violin plots of serum 3-hydroxybutyrate levels in non-carriers (blue) and carriers (yellow) of the rs1016378028 variant. Whiskers range from upper and lower fences (1.5 × IQR); points represent outliers. (e) HDAC2 gene expression in human islets from donors with type 1 and type 2 diabetes and control islets treated (brown) or not (green) with different cytokines or other stressor compounds. **p<0.01, ***p<0.001 vs no treatment (adjusted p values, Benjamini–Hochberg correction). Whiskers range from upper and lower fences (1.5 × IQR); points represent outliers. (f) HDAC2 gene expression in multiple tissues from GTEx and TIGER portals. Each box plot shows expression in a different tissue or cell line. Whiskers range from upper and lower fences (1.5 × IQR); points represent outliers
Fig. 5
PS for the risk of type 2 diabetes in Latino population. (a) Variance explained by a PS using these Latino GWAS association statistics (LAT, green), the AGEN East Asian GWAS association statistics (EAS, grey), the DIAMANTE European GWAS association statistics (EUR, red), a combination of DIAMANTE European and AGEN East Asian GWAS association statistics (yellow) and a combination of DIAMANTE European, AGEN East Asian and these Latino GWAS association statistics (blue). METSB was used as the testing cohort. (b) Receiver operating characteristic curves for the type 2 diabetes risk prediction explained by a model including sex, age and ten PCs of ancestry (black), a model including covariates and a PS constructed using DIAMANTE European (red) and a model including covariates and a PS constructed using a combination of DIAMANTE European, AGEN East Asian and these Latino GWAS association statistics (blue). (c) Distribution of a multi-ancestry PS using a combination of DIAMANTE European, AGEN East Asian and these Latino GWAS association statistics in type 2 diabetes cases (blue) and controls (black). The table shows the OR per SD attributed to the multi-ancestry PS, as well as the OR for high-risk individuals

