Cell Genom. 2022 Oct 12;2(10):100180.
doi: 10.1016/j.xgen.2022.100180.

Best practices for multi-ancestry, meta-analytic transcriptome-wide association studies: Lessons from the Global Biobank Meta-analysis Initiative

Arjun Bhattacharya et al. Cell Genom. 2022.

Abstract

The Global Biobank Meta-analysis Initiative (GBMI), through its diversity, provides a valuable opportunity to study population-wide and ancestry-specific genetic associations. However, with multiple ascertainment strategies and multi-ancestry study populations across biobanks, GBMI presents unique challenges in implementing statistical genetics methods. Transcriptome-wide association studies (TWASs) boost detection power for genetic associations and add biological context to them by integrating genetic variant-to-trait associations from genome-wide association studies (GWASs) with predictive models of gene expression. TWASs present unique challenges beyond GWASs, especially in a multi-biobank, meta-analytic setting. Here, we present the GBMI TWAS pipeline, outlining practical considerations for ancestry and tissue specificity, meta-analytic strategies, and open challenges at every step of the framework. We advise conducting ancestry-stratified TWASs using ancestry-specific expression models and meta-analyzing results using inverse-variance weighting, an approach that showed the least test statistic inflation. Our work provides a foundation for adding transcriptomic context to biobank-linked GWASs, allowing for ancestry-aware discovery to accelerate genomic medicine.
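The inverse-variance weighting recommended in the abstract can be illustrated with a minimal sketch. It assumes a fixed-effect model and takes per-stratum effect estimates and standard errors for a single gene as inputs; the function name `ivw_meta` and its arguments are hypothetical and are not taken from the GBMI pipeline.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted (IVW) meta-analysis.

    betas: effect estimates for one gene, one per stratum
           (for example, one per biobank and ancestry group).
    ses:   matching standard errors.
    Returns (pooled_beta, pooled_se, z_score).
    Illustrative sketch only, not the GBMI implementation.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each stratum is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se
```

Strata with smaller standard errors contribute more to the pooled estimate, so large biobanks tend to dominate the pooled result.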

Conflict of interest statement

DECLARATION OF INTERESTS The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Challenges in multi-ancestry, meta-analytic TWASs. Each level of data in a TWAS introduces a set of challenges: (1) genetics data include confounding from genetic ancestry, population structure and relatedness, and complex linkage disequilibrium patterns; (2) gene expression data introduce context-specific factors, such as tissue-, cell-type-, or cell-state-specific expression; and (3) phenotypic data involve challenges in acquiring and aggregating phenotypes, properly defining controls for phenotypes, and ascertainment and selection bias from non-random sampling.
Figure 2
Comparison of predictive performance of expression prediction models across ancestry. (A) Adjusted R² difference (y axis) when predicting expression in the AFR imputation sample between models trained in EUR and AFR training samples, across tissues (x axis). The proportion of models with improved R² using ancestry-aligned versus ancestry-mismatched models is labeled. (B) Adjusted R² difference between ancestry-specific and ancestry-unaware models imputing into EUR (left) and AFR (right) samples.
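For readers unfamiliar with the metric in Figure 2, the sketch below uses the standard textbook definition of adjusted R². Whether the authors computed it in exactly this way (for example, which predictor count they used) is not stated here, so the function `adjusted_r2` and its parameters are assumptions for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2:           unadjusted coefficient of determination.
    n_samples:    number of samples (n).
    n_predictors: number of predictors in the model (p).
    Illustrative only; the figure's exact computation may differ.
    """
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


def adjusted_r2_difference(r2_aligned, r2_mismatched, n_samples, n_predictors):
    """Difference in adjusted R^2 between two models evaluated on the same
    sample, assuming (hypothetically) the same predictor count for both."""
    return (adjusted_r2(r2_aligned, n_samples, n_predictors)
            - adjusted_r2(r2_mismatched, n_samples, n_predictors))
```

A positive difference under this convention means the ancestry-aligned model predicts expression better than the mismatched one.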
Figure 3
Comparison of meta-analytic strategies for multi-biobank, multi-ancestry TWASs. (A) Per-ancestry meta-analyzed TWAS scores across EUR (x axis) versus AFR ancestry (y axis). The dotted lines indicate p < 2.5 × 10⁻⁶, with a 45-degree line for reference. Points are colored by the ancestry population in which the TWAS association meets p < 2.5 × 10⁻⁶. (B) QQ-plot of TWAS Z scores, colored by meta-analytic strategy. Per ancestry refers to TWAS meta-analysis across meta-analyzed ancestry-specific GWAS summary statistics. Per bank/per ancestry refers to TWAS meta-analysis using all biobank- and ancestry-specific GWAS summary statistics. (C) Effect sizes and Bonferroni-corrected confidence intervals (CIs) for TWAS associations across 17 individual biobanks (EUR in teal, AFR in red) and two IVW meta-analysis strategies (in yellow) for five representative genes. The Higgins-Thompson I² statistic is provided.
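The Higgins-Thompson I² statistic reported in panel C measures the share of variation across studies attributable to heterogeneity rather than sampling error. A minimal sketch, assuming it is derived from Cochran's Q under fixed-effect inverse-variance weights, is shown below; the function name `higgins_i2` and its inputs are hypothetical.

```python
def higgins_i2(betas, ses):
    """Higgins-Thompson I^2 (in percent) from per-study estimates.

    Uses Cochran's Q with inverse-variance weights:
        Q   = sum_i w_i * (beta_i - pooled)^2,  w_i = 1 / se_i^2
        I^2 = max(0, (Q - df) / Q) * 100,       df  = k - 1
    Illustrative sketch; inputs are hypothetical per-biobank values.
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching lengths")

    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)

    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    if q <= 0:
        return 0.0  # identical estimates: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0
```

Values near 0% suggest the per-biobank estimates are consistent with one shared effect, while larger values indicate that estimates differ by more than sampling error would explain.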
Figure 4
GReX-PheWAS for categorizing phenome-wide associations for TAF7 genetically regulated expression in UKBB. (A) −log10 Benjamini-Hochberg FDR-adjusted p values of gene-trait associations (GTAs) (y axis) across nine phenotype groups (x axis). The dotted line shows FDR-adjusted p = 0.05. (B) Miami plot of TWAS Z scores (y axis) across phenotypes (x axis), colored by phecode group. The dotted line shows Benjamini-Hochberg FDR-corrected significance, and phenotypes are labeled if the association passes Bonferroni correction.
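The Benjamini-Hochberg FDR adjustment referenced in Figure 4 can be sketched in a few lines. This is the standard step-up procedure written in plain Python; the function name `bh_adjust` is hypothetical, and the authors' own software is not specified here.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg FDR-adjusted p values, in the input order.

    For the p value with ascending rank r out of m tests, the adjusted
    value is min over ranks >= r of (p * m / rank), capped at 1.
    Illustrative sketch of the standard procedure.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m

    running_min = 1.0
    # Walk from the largest p value down so monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

An association would then be called significant at the 5% FDR level when its adjusted value falls below 0.05, matching the dotted line described in panel A.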
