[Preprint]. 2025 Feb 21:2025.02.19.25322561.
doi: 10.1101/2025.02.19.25322561.

Cross-cohort analysis of expression and splicing quantitative trait loci in TOPMed

Peter Orchard, Thomas W Blackwell, Linda Kachuri, Peter J Castaldi, Michael H Cho, Stephanie A Christenson, Peter Durda, Stacey Gabriel, Craig P Hersh, Scott Huntsman, Seungyong Hwang, Roby Joehanes, Mari Johnson, Xingnan Li, Honghuang Lin, Ching-Ti Liu, Yongmei Liu, Angel C Y Mak, Ani W Manichaikul, David Paik, Aabida Saferali, Joshua D Smith, Kent D Taylor, Russell P Tracy, Jiongming Wang, Mingqiang Wang, Joshua S Weinstock, Jeffrey Weiss, Heather E Wheeler, Ying Zhou, Sebastian Zoellner, Joseph C Wu, Luisa Mestroni, Sharon Graw, Matthew R G Taylor, Victor E Ortega, Craig W Johnson, Weiniu Gan, Goncalo Abecasis, Deborah A Nickerson, Namrata Gupta, Kristin Ardlie, Prescott G Woodruff, Yinan Zheng, Russell P Bowler, Deborah A Meyers, Alex Reiner, Charles Kooperberg, Elad Ziv, Vasan S Ramachandran, Martin G Larson, L Adrienne Cupples, Esteban G Burchard, Edwin K Silverman, Stephen S Rich, Nancy Heard-Costa, Hua Tang, Jerome I Rotter, Albert V Smith, Daniel Levy, NHLBI TOPMed Consortium Multi-Omics Working Group, NHLBI TOPMed Consortium, François Aguet, Laura Scott, Laura M Raffield, Stephen C J Parker

Peter Orchard et al. medRxiv.

Abstract

Most genetic variants associated with complex traits and diseases occur in non-coding genomic regions and are hypothesized to regulate gene expression. To understand the genetics underlying gene expression variability, we characterize 14,324 ancestrally diverse RNA-sequencing samples from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and integrate whole genome sequencing data to perform cis and trans expression and splicing quantitative trait locus (cis-/trans-e/sQTL) analyses in six tissues and cell types, most notably whole blood (N=6,454) and lung (N=1,291). We show this dataset enables greater detection of secondary cis-e/sQTL signals than was achieved in previous studies, and that secondary cis-eQTL and primary trans-eQTL signal discovery is not saturated even though eGene discovery is. Most TOPMed trans-eQTL signals colocalize with cis-e/sQTL signals, suggesting many trans signals are mediated by cis signals. We fine-map European UK BioBank GWAS signals from 164 traits and colocalize the resulting 34,107 fine-mapped GWAS signals with TOPMed e/sQTL signals, finding that of 10,611 GWAS signals with a colocalization, 7,096 GWAS signals colocalize with at least one secondary e/sQTL signal. These results demonstrate that larger e/sQTL analyses will continue to uncover secondary e/sQTL signals, and that these new signals will benefit GWAS interpretation.
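The colocalization counts in the abstract are easier to compare as proportions. The short sketch below uses only the three counts stated above; the variable names are illustrative and not taken from the study's code.

```python
# Counts copied from the abstract; variable names are illustrative only.
fine_mapped_gwas_signals = 34_107  # fine-mapped UK BioBank GWAS signals (164 traits)
with_colocalization = 10_611       # GWAS signals with at least one e/sQTL colocalization
with_secondary_coloc = 7_096       # of those, colocalizing with >= 1 secondary e/sQTL signal

# Share of all fine-mapped GWAS signals that colocalize with any TOPMed e/sQTL signal.
frac_colocalized = with_colocalization / fine_mapped_gwas_signals

# Share of colocalizing GWAS signals that involve at least one secondary e/sQTL signal.
frac_secondary = with_secondary_coloc / with_colocalization

print(f"{frac_colocalized:.1%} of fine-mapped GWAS signals have a colocalization")
print(f"{frac_secondary:.1%} of colocalizing GWAS signals involve a secondary e/sQTL signal")
```

Under these counts, roughly 31% of fine-mapped GWAS signals colocalize with an e/sQTL signal, and about two thirds of those colocalize with at least one secondary signal.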

Conflict of interest statement

PJC has received grant support from Bayer and consultant fees from Verona pharmaceuticals. MHC has received grant support from Bayer. JCW is co-founder of Greenstone Biosciences. EKS has received grant funding from Bayer. VEO previously served on independent data and monitoring committees (IDMC) for Regeneron and Sanofi. VEO receives compensation from the American Medical Association for his role as associate editor for JAMA. FA is an employee of Illumina, Inc. and is an inventor on a patent application related to TensorQTL filed by the Broad Institute. LMR is a consultant for the TOPMed Administrative Coordinating Center (through Westat). SCJP is supported by Pfizer. GRA is an employee of Regeneron Pharmaceuticals and owns stock and stock options for Regeneron Pharmaceuticals.

Figures

Figure 1.
Study design. (A) RNA-seq sample sizes for cis- and trans-e/sQTL scans, by TOPMed study and tissue. The full TOPMed study names and accompanying abbreviations are: Framingham Heart Study (FHS); Gene-Environments and Admixture in Latino Asthmatics (GALA II); Study of African Americans, Asthma, Genes, & Environments (SAGE); Subpopulations and Intermediate Outcome Measures In COPD Study (SPIROMICS); Women’s Health Initiative (WHI); COPDGene Study (COPDGene); Multi-Ethnic Study of Atherosclerosis (MESA); Lung Tissue Research Consortium (LTRC). Diagram generated with SankeyMATIC. (B) cis-e/sQTL scans tested MAF ≥ 0.01 variants within 1 Mb of gene TSS (for whole blood, a scan with variant MAF ≥ 0.001 was also performed). trans-e/sQTL scans tested variant-gene pairs on separate chromosomes (variants with MAF ≥ 0.05). Splicing phenotypes (intron excision ratios) were derived using LeafCutter (Li et al., 2018). (C) cis-e/sQTL signals were colocalized with trans-e/sQTL signals to nominate genes mediating trans effects, and cis- and trans-e/sQTL signals were colocalized with 34,107 GWAS signals from 164 UK BioBank GWAS to nominate genes and molecular mechanisms underlying GWAS signals.
Figure 2.
cis-e/sQTL summary. (A) Sample sizes per tissue (top panel), number of genes with a significant cis-e/sQTL (cis-e/sGenes) and number of SuSiE credible sets discovered per cis-e/sGene (second from top), credible set sizes (third from top), and number of primary or secondary (including tertiary, quaternary, etc.) cis-e/sQTL signals per tissue (bottom). (B) cis-eQTL saturation analysis, and comparison to other published datasets (from eQTL-Catalogue or GTEx). Number of cis-eGenes discovered at each downsampled sample size (left) and total number of cis-eQTL signals discovered (right; 95% credible sets). Results are shown for 1% FDR cis-eGenes, as eQTL-Catalogue fine-maps QTL signals for 1% FDR cis-e/sGenes. (C) Functional annotation enrichments for whole blood cis-e/sQTLs. Enrichment was calculated relative to control credible sets matched on MAF, LD, and number of genes tested against; error bars represent 95% confidence intervals. (D) Effect size (absolute allelic fold change) for cis-eQTL in the whole blood scan with a MAF threshold of 0.1%.
Figure 3.
trans-e/sQTL results. (A) Number of trans-eGenes and trans-sGenes as a function of sample size. TOPMed tissues are linked to the corresponding GTEx and DIRECT tissues. Unlike TOPMed and GTEx, DIRECT did not apply a MAF threshold. TOPMed and GTEx defined trans as ‘different chromosome’, while DIRECT defined trans as ‘different chromosome or gene-variant pair ≥ 5 Mb apart’. (B) Effect size (absolute allelic fold change) for trans-eQTL and cis-eQTL with MAF ≥ 0.05 in whole blood (cis-eQTL effect sizes from the MAF ≥ 0.001 scan). (C) Number of trans-eQTL and trans-sQTL credible sets discovered when fine-mapping the region around primary trans-e/sQTL signals (±1 Mb). (D), (E) Two (of three total) whole blood trans-eQTL signals for gene AGAP2. One colocalizes with an RREB1 cis-sQTL and the other contains an RREB1 missense variant, suggesting that both variants impact AGAP2 expression via distinct functional effects on RREB1. The credible set variants for the third AGAP2 trans-eQTL signal are in an RREB1 intron.
Figure 4.
Colocalization of e/sQTL with UKBB GWAS signals. (A) Number of GWAS signals colocalizing with at least one e/sQTL credible set from each tissue and modality (cross-ancestry e/sQTL scans). (B) Heatmap displaying, for each GWAS signal with at least one e/sQTL colocalization, the maximum coloc posterior probability of colocalization for each tissue and modality. (C,D) Two IL2RA cis-eQTL signals colocalize with two albumin/globulin ratio GWAS signals. Marginal p-values are displayed in (C), and log Bayes factors for each of the two colocalizing effects, represented by the two colors, are displayed in (D); for the eQTL panel, the sign of each variant reflects the direction of effect on the gene’s expression for the GWAS trait-increasing allele in the colocalizing GWAS effect. (E,F) Three HK1 cis-eQTL signals colocalize with three mean corpuscular volume GWAS signals.
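The scan definitions given in the Figure 1 caption (cis: MAF ≥ 0.01 variants within 1 Mb of the gene TSS; trans: MAF ≥ 0.05 variants on a different chromosome) can be sketched as a small classifier. This is an illustration under stated assumptions, not the study's pipeline: the `Variant` and `Gene` classes, the function name, and the inclusive treatment of the 1 Mb boundary are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds as stated in the Figure 1 caption.
CIS_WINDOW_BP = 1_000_000  # "within 1 Mb of gene TSS"
CIS_MIN_MAF = 0.01         # default cis scan; whole blood also used 0.001
TRANS_MIN_MAF = 0.05       # trans scans


@dataclass(frozen=True)
class Variant:  # hypothetical container for this sketch
    chrom: str
    pos: int
    maf: float


@dataclass(frozen=True)
class Gene:  # hypothetical container for this sketch
    chrom: str
    tss: int


def scan_for_pair(variant: Variant, gene: Gene,
                  cis_min_maf: float = CIS_MIN_MAF) -> Optional[str]:
    """Return 'cis', 'trans', or None if the pair would not be tested."""
    if variant.chrom == gene.chrom:
        # Assumption: the 1 Mb window boundary is treated as inclusive.
        in_window = abs(variant.pos - gene.tss) <= CIS_WINDOW_BP
        return "cis" if in_window and variant.maf >= cis_min_maf else None
    # Different chromosome: eligible for the trans scan if common enough.
    return "trans" if variant.maf >= TRANS_MIN_MAF else None


gene = Gene("chr1", 5_000_000)
print(scan_for_pair(Variant("chr1", 5_500_000, 0.02), gene))  # cis
print(scan_for_pair(Variant("chr1", 7_000_000, 0.30), gene))  # None (same chromosome, outside window)
print(scan_for_pair(Variant("chr2", 100, 0.06), gene))        # trans
```

Passing `cis_min_maf=0.001` mimics the additional low-frequency whole blood cis scan. Note that under the 'different chromosome' definition used here, distal same-chromosome pairs are not tested at all, which is one way this definition differs from the DIRECT definition described in the Figure 3 caption.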

References

    1. Aguet F., Brown A. A., Castel S. E., Davis J. R., He Y., Jo B., Mohammadi P., Park Y., Parsana P., Segrè A. V., Strober B. J., Zappala Z., Cummings B. B., Gelfand E. T., Hadley K., Huang K. H., Lek M., Li X., Nedzel J. L., … Zhu J. (2017). Genetic effects on gene expression across human tissues. Nature, 550(7675), 204–213. 10.1038/nature24277
    2. Araujo D. S., Nguyen C., Hu X., Mikhaylova A. V., Gignoux C., Ardlie K., Taylor K. D., Durda P., Liu Y., Papanicolaou G., Cho M. H., Rich S. S., Rotter J. I., NHLBI TOPMed Consortium, Im H. K., Manichaikul A., & Wheeler H. E. (2023). Multivariate adaptive shrinkage improves cross-population transcriptome prediction and association studies in underrepresented populations. HGG Advances, 4(4), 100216. 10.1016/j.xhgg.2023.100216
    3. Asai Y., Eslami A., van Ginkel C. D., Akhabir L., Wan M., Ellis G., Ben-Shoshan M., Martino D., Ferreira M. A., Allen K., Mazer B., de Groot H., de Jong N. W., Gerth van Wijk R. N., Dubois A. E. J., Chin R., Cheuk S., Hoffman J., Jorgensen E., … Daley D. (2018). Genome-wide association study and meta-analysis in multiple populations identifies new loci for peanut allergy and establishes C11orf30/EMSY as a genetic risk factor for food allergy. The Journal of Allergy and Clinical Immunology, 141(3), 991–1001. 10.1016/j.jaci.2017.09.015
    4. Astle W. J., Elding H., Jiang T., Allen D., Ruklisa D., Mann A. L., Mead D., Bouman H., Riveros-Mckay F., Kostadima M. A., Lambourne J. J., Sivapalaratnam S., Downes K., Kundu K., Bomba L., Berentsen K., Bradley J. R., Daugherty L. C., Delaneau O., … Soranzo N. (2016). The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease. Cell, 167(5), 1415–1429.e19. 10.1016/j.cell.2016.10.042
    5. Brown A. A., Fernandez-Tajes J. J., Hong M., Brorsson C. A., Koivula R. W., Davtian D., Dupuis T., Sartori A., Michalettou T.-D., Forgie I. M., Adam J., Allin K. H., Caiazzo R., Cederberg H., De Masi F., Elders P. J. M., Giordano G. N., Haid M., Hansen T., … Viñuela A. (2023). Genetic analysis of blood molecular phenotypes reveals common properties in the regulatory networks affecting complex traits. Nature Communications, 14(1), Article 1. 10.1038/s41467-023-40569-3
