Nat Genet. 2025 Jan;57(1):42-52.
doi: 10.1038/s41588-024-01994-2. Epub 2025 Jan 2.

Fine-mapping causal tissues and genes at disease-associated loci

Benjamin J Strober et al. Nat Genet. 2025 Jan.

Abstract

Complex diseases often have distinct mechanisms spanning multiple tissues. We propose tissue-gene fine-mapping (TGFM), which infers the posterior inclusion probability (PIP) for each gene-tissue pair to mediate a disease locus by analyzing summary statistics and expression quantitative trait loci (eQTL) data; TGFM also assigns PIPs to non-mediated variants. TGFM accounts for co-regulation across genes and tissues and models uncertainty in cis-predicted expression models, enabling correct calibration. We applied TGFM to 45 UK Biobank diseases or traits using eQTL data from 38 Genotype-Tissue Expression (GTEx) tissues. TGFM identified an average of 147 PIP > 0.5 causal genetic elements per disease or trait, of which 11% were gene-tissue pairs. Causal gene-tissue pairs identified by TGFM reflected both known biology (for example, TPO-thyroid for hypothyroidism) and biologically plausible findings (for example, SLC20A2-artery aorta for diastolic blood pressure). Application of TGFM to single-cell eQTL data from nine cell types in peripheral blood mononuclear cells (PBMCs), analyzed jointly with GTEx tissues, identified 30 additional causal gene-PBMC cell type pairs.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Extended Data Figure 1: Comparison of tissue-gene fine-mapping power at the same level of FDR in simulations.
Average gene-tissue fine-mapping power (x-axis) at a specific level of FDR (y-axis) across 100 simulations for various fine-mapping methods (see legend) at eQTL sample sizes of 100–300 (a), 300 (b), 500 (c), and 1000 (d). We note that all methods other than TGFM (cTWAS-TG, cTWAS, FOCUS-TG, FOCUS, coloc, JLIM, SMR, and SMR+HEIDI) are severely mis-calibrated, with high FDR at even the most stringent p-value or posterior probability thresholds, as evidenced by the fact that no method other than TGFM achieves an FDR ≤ 0.34 at any threshold. JLIM, SMR, and SMR+HEIDI compute p-values for each gene-tissue pair, whereas TGFM, cTWAS-TG, cTWAS, FOCUS-TG, FOCUS, and coloc calculate posterior probabilities for each gene-tissue pair. SMR corresponds to using the SMR p-value to assess the significance of a gene-tissue pair, whereas SMR+HEIDI corresponds to using the SMR p-value to assess the significance of a gene-tissue pair after filtering to gene-tissue pairs with HEIDI p-value ≤ 0.05. We do not visualize FDR and power for any p-value or posterior probability threshold that retains fewer than 2 gene-tissue pairs, in order to remove highly uncertain FDR and power estimates from the visualization.
Extended Data Figure 2: Calibration and power of tissue-gene fine-mapping for various versions of TGFM in simulations.
(a,b) Average tissue-gene fine-mapping FDR across 100 simulations for various fine-mapping methods (see legend) across eQTL sample sizes (x-axis) at PIP=0.5 (a) and PIP=0.9 (b). The single thick dashed horizontal line denotes (1 – PIP threshold) (see main text). The thin dashed horizontal lines specific to each bar denote (1 – average PIP), where the average is taken across all genetic elements belonging to that bar (see main text). (c,d) Average tissue-gene fine-mapping power across 100 simulations for various fine-mapping methods (see legend) across eQTL sample sizes (x-axis) at PIP=0.5 (c) and PIP=0.9 (d). Error bars denote 95% confidence intervals. This supplementary figure is similar to Fig. 1, except that it shows calibration and power of additional tissue-gene fine-mapping methods. “TGFM (no sampling, uniform prior)” corresponds to TGFM (Gene-Tissue) with a uniform prior and a single cis-predicted expression model (based on posterior mean causal cis-eQTL effect sizes) instead of averaging results across 100 sampled cis-predicted expression models. “TGFM (uniform prior)” corresponds to TGFM (Gene-Tissue) with a uniform prior. “TGFM” corresponds to the default version of TGFM (Gene-Tissue) (shown in Fig. 1).
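The two reference levels in the calibration panels can be illustrated numerically. If PIPs are well calibrated, the expected fraction of false discoveries among elements with PIP at or above a threshold is (1 – average PIP) of the selected elements, which can never exceed (1 – PIP threshold). The following is a minimal Python sketch under that assumption; the function name `fdr_reference_lines` and the toy PIP values are hypothetical and are not taken from the TGFM software or results.

```python
from statistics import mean


def fdr_reference_lines(pips, threshold):
    """Return (1 - threshold, 1 - average PIP) for elements with PIP >= threshold.

    Both values are reference levels against which an empirical FDR can be
    compared, assuming calibrated PIPs. They are not FDR estimates themselves.
    """
    selected = [p for p in pips if p >= threshold]
    if not selected:
        raise ValueError("no element passes the PIP threshold")
    conservative = 1.0 - threshold             # 1 - PIP threshold
    less_conservative = 1.0 - mean(selected)   # 1 - average PIP of selected elements
    return conservative, less_conservative


# Hypothetical PIPs, for illustration only.
toy_pips = [0.95, 0.80, 0.60, 0.55, 0.30, 0.10]
conservative, less_conservative = fdr_reference_lines(toy_pips, threshold=0.5)
print(conservative, less_conservative)
```

Because every selected PIP is at least the threshold, the second value is always less than or equal to the first, which is why (1 – average PIP) is the less conservative reference.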
Extended Data Figure 3: Enrichment of fine-mapped TGFM genes within non-disease-specific gene sets.
(a) Enrichment of genes with TGFM (Gene) PIP > 0.5 within non-disease-specific gene sets meta-analyzed over 16 independent traits. Error bars represent 95% confidence intervals. Odds ratios and standard errors on the odds ratio were computed using logistic regression. (b) Enrichment of genes with TGFM (Gene) PIP > 0.25, 0.5, and 0.75 (see legend) within non-disease-specific gene sets meta-analyzed over 16 independent traits. Error bars represent 95% confidence intervals. Odds ratios and standard errors on the odds ratio were computed using logistic regression. Numerical results reported in Supplementary Table 14.
Extended Data Figure 4: Comparison of TGFM (Gene) and cTWAS calibration and power using a silver-standard gene set of 69 known LDL cholesterol genes.
(a-b) Empirical FDR (y-axis) using a silver-standard gene set of 69 known LDL cholesterol genes at PIP greater than or equal to a range of PIP thresholds (x-axis) for TGFM (Gene) (a) and cTWAS applied to GTEx liver (b). Light shading denotes 95% confidence intervals. Black dashed line denotes (1 – average PIP), a less conservative choice than (1 – PIP threshold). (c) Average gene fine-mapping power (x-axis) at a specific level of FDR (y-axis) based on the silver-standard gene set of 69 known LDL cholesterol genes for TGFM (Gene) and cTWAS applied to GTEx liver (see legend). PIPs for cTWAS applied to GTEx liver were extracted from Supplementary Table 2 of ref. .
Extended Data Figure 5: Properties of disease gene fine-mapping methods.
For each disease gene fine-mapping method, we report whether or not the method jointly models tissues and genes; models non-mediated variants; and models uncertainty in cis-predicted gene expression. *: FOCUS allows for modeling of non-mediated genetic effects via a single genotype intercept term shared across all variants, but this functionality is not enabled in the default version of FOCUS (and does not ameliorate mis-calibration; Supplementary Fig. 22).
Figure 1: Calibration and power of tissue-gene fine-mapping methods in simulations.
(a,b) Average gene-tissue pair fine-mapping FDR across 100 simulations for various fine-mapping methods (see legend) across eQTL sample sizes (x-axis) at PIP=0.5 (a) and PIP=0.9 (b). Dashed horizontal line denotes 1 – PIP threshold (see main text). Numerical results are reported in Supplementary Table 1. (c,d) Average gene-tissue pair fine-mapping power across 100 simulations for various fine-mapping methods (see legend) across eQTL sample sizes (x-axis) at PIP=0.5 (c) and PIP=0.9 (d). Error bars denote 95% confidence intervals based on standard error of a sample proportion. Numerical results are reported in Supplementary Table 2.
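The error bars described above are 95% confidence intervals based on the standard error of a sample proportion. A minimal Python sketch of one common way to compute such an interval follows. It assumes the normal-approximation (Wald) form with z = 1.96 and clips the bounds to [0, 1]; neither detail is confirmed by the caption, and the function name `proportion_ci` and the counts are hypothetical.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a normal-approximation interval.

    Assumption: the standard error is sqrt(p * (1 - p) / n) and the interval
    is p +/- z * SE, clipped to the valid range [0, 1].
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    se = math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical counts: 30 of 100 simulated causal elements detected.
power, lower, upper = proportion_ci(30, 100)
print(f"power={power:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

The same calculation applies to FDR when it is treated as a proportion of false discoveries among all discoveries.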
Figure 2: Calibration and power of fine-mapping different classes of genetic elements with TGFM in simulations.
(a,b) Average fine-mapping FDR across 100 simulations using TGFM for different classes of genetic elements (see legend) across eQTL sample sizes (x-axis) at PIP=0.5 (a) and PIP=0.9 (b). Dashed horizontal line denotes 1 – PIP threshold (see main text). Numerical results are reported in Supplementary Table 3. (c,d) Average fine-mapping power across 100 simulations using TGFM for different classes of genetic elements (see legend) across eQTL sample sizes (x-axis) at PIP=0.5 (c) and PIP=0.9 (d). Error bars denote 95% confidence intervals based on standard error of a sample proportion. Numerical results are reported in Supplementary Table 4.
Figure 3: Summary results of fine-mapping genetic elements with TGFM for 16 independent UK Biobank diseases and traits.
We report the number of (a) Gene-tissue pairs, (b) Genes, and (c) (non-mediated) Variants fine-mapped using TGFM (y-axis; square root scale) across 16 independent UK Biobank traits (x-axis) at various PIP thresholds ranging from 0.2 to 1.0 (color-bars). Horizontal black lines denote the number of genetic elements fine-mapped at PIP=0.5. FEV1:FVC, ratio of forced expiratory volume in 1 second to forced vital capacity; Platelet volume, Mean platelet volume; Diastolic BP, Diastolic blood pressure; Reticulocyte count, High-light scatter reticulocyte count; Corp. hemoglobin, Mean corpuscular hemoglobin; FVC, Forced vital capacity. Results for all 45 UK Biobank diseases and traits are reported in Supplementary Fig. 29, and numerical results are reported in Supplementary Table 8.
Figure 4: Properties of fine-mapped tissues and genes.
(a) Proportion of fine-mapped gene-tissue pairs in each tissue (x-axis) for 14 representative traits (y-axis). Proportions for each trait were calculated by counting the number of gene-tissue pairs with TGFM PIP > 0.5 in each tissue and normalizing the counts across tissues. Tissues are only displayed if their proportion is > 0.2 for at least one of the 14 representative traits. Asterisks denote statistical significance (FDR ≤ 0.05 via the TGFM tissue-specific prior; see Methods) of each tissue-trait pair. Results for all remaining traits and tissues are reported in Supplementary Fig. 30, and numerical results are reported in Supplementary Table 9. The 14 representative traits were selected by including 12 of the 16 independent traits (Fig. 3) with many high PIP gene-tissue pairs and two additional, interesting traits (All autoimmune and Vitamin D level). (b) Proportion of stratified tissue-trait pairs reported as statistically significant in S-LDSC analyses using chromatin data (y-axis) as a function of S-LDSC significance thresholds (x-axis), across all 45 traits analyzed; tissue-trait pairs are stratified according to significance (FDR ≤ 0.05 or FDR > 0.05) via the TGFM tissue-specific prior. Results at alternative TGFM tissue-specific prior significance thresholds are reported in Supplementary Fig. 33, and numerical results are reported in Supplementary Table 12. (c) Top panel shows average PoPS score (y-axis) of genes stratified by TGFM (Gene) PIP (x-axis). Averages were computed across genes for the 16 independent traits listed in Fig. 3, as both PoPS score and TGFM gene PIPs are trait-specific. Error bars denote 95% confidence intervals based on standard error of a sample mean. Bottom panel shows the distribution of PoPS scores (y-axis) for genes stratified by TGFM (Gene) PIP (x-axis). Distributions were computed across genes for the 16 independent traits listed in Fig. 3, as both PoPS score and TGFM gene PIPs are trait-specific. There are 83821 total gene-trait pairs: 26797 in TGFM bin 0 ≤ PIP < 0.01, 54328 in TGFM bin 0.01 ≤ PIP < 0.25, 1879 in TGFM bin 0.25 ≤ PIP < 0.5, 442 in TGFM bin 0.5 ≤ PIP < 0.7, 250 in TGFM bin 0.7 ≤ PIP < 0.9, and 125 in TGFM bin 0.9 ≤ PIP < 1. Numerical results are reported in Supplementary Table 13. (d) Empirical FDR when distinguishing a silver-standard gene set of 69 known LDL cholesterol genes analyzed in Figure 4 of ref. from nearby genes (y-axis), for TGFM (Gene) PIP for LDL cholesterol greater than or equal to a range of PIP thresholds (x-axis). Light green shading denotes 95% confidence intervals. Numerical results are reported in Supplementary Table 15.
Figure 5: Robustness of TGFM results in analyses of alternative eQTL data sets.
(a) Results of tissue-ablation analysis of 115 gene-tissue pairs for 18 representative traits that were prioritized by TGFM (PIP > 0.5) in the primary analysis. We report the number of loci in the tissue-ablation analysis with no gene-tissue pair prioritized by TGFM (PIP > 0.5); a gene-tissue pair prioritized by TGFM (PIP > 0.5) corresponding to the same gene and the best proxy tissue (see Methods); a gene-tissue pair prioritized by TGFM (PIP > 0.5) corresponding to the same gene and a non-proxy tissue; or a gene-tissue pair prioritized by TGFM (PIP > 0.5) corresponding to a different gene. Results at alternative PIP thresholds are reported in Supplementary Fig. 35, and numerical results are reported in Supplementary Table 16. (b) Results of replacing GTEx whole blood (N=320) with pseudobulk PBMC (N=113) for 62 gene-trait pairs for 18 representative traits that TGFM fine-mapped for GTEx whole blood (PIP > 0.5) in the primary analysis. The vertical red line denotes the average PIP in PBMC, and the histogram summarizes the average PIP of each GTEx tissue (excluding whole blood). Numerical results are reported in Supplementary Table 17. The 18 representative traits consist of the 16 independent traits (Fig. 3) and two additional, interesting traits (All autoimmune and Vitamin D levels).
Figure 6: Examples of fine-mapped gene-tissue-disease triplets identified by TGFM.
We report 6 example loci for which TGFM fine-mapped a gene-tissue pair (PIP > 0.5). In each example we report the marginal GWAS and TWAS association −log10 p-values (y-axis) of non-mediated variants (blue circles) and gene-tissue pairs (red triangles). Marginal TWAS association −log10 p-values were calculated by taking the median −log10 TWAS p-value across the 100 sets of sampled cis-predicted expression models for each gene-tissue pair. See the Methods section for a description of how GWAS association statistics were calculated. The genomic position of each gene-tissue pair (x-axis) was based on the gene’s TSS. The color shading of each variant and gene-tissue pair was determined by its TGFM PIP. Any genetic element with TGFM PIP > 0.5 was made larger in size. Dashed horizontal blue and red lines represent GWAS significance (5 × 10⁻⁸) and TWAS significance (4.2 × 10⁻⁷) thresholds, respectively. Numerical results are reported in Supplementary Table 18.
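The marginal TWAS summary described in this caption, the median −log10 p-value across sampled cis-predicted expression models, can be sketched in a few lines of Python. The function name `median_neg_log10_p` and the three toy p-values are hypothetical (the caption describes 100 sampled models per gene-tissue pair); only the 4.2 × 10⁻⁷ threshold is taken from the caption.

```python
import math
from statistics import median


def median_neg_log10_p(p_values):
    """Return the median of -log10(p) over a collection of p-values."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return median([-math.log10(p) for p in p_values])


# Hypothetical TWAS p-values for one gene-tissue pair across sampled models.
toy_p_values = [1e-8, 1e-6, 1e-7]
score = median_neg_log10_p(toy_p_values)

# TWAS significance threshold quoted in the caption, on the -log10 scale.
twas_threshold = -math.log10(4.2e-7)
exceeds_threshold = score > twas_threshold
print(score, twas_threshold, exceeds_threshold)
```

Taking the median on the −log10 scale is equivalent to taking the median p-value and then transforming it when the number of models is odd, because −log10 is monotone.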
Figure 7: Summary results of fine-mapping gene-PBMC cell type pairs with TGFM for 18 representative UK Biobank diseases and traits.
(a-b) Number of gene-PBMC cell type pairs fine-mapped using TGFM (y-axis; square root scale) across 18 representative UK Biobank traits (x-axis) at various PIP thresholds ranging from 0.2 to 1.0 (color-bar), distinguishing between (a) autoimmune diseases and blood cell traits and (b) non-blood-related traits. Horizontal black lines denote the number of gene-PBMC cell type pairs fine-mapped at PIP=0.5. The 18 representative traits (same as Fig. 5) consist of the 16 independent traits (Fig. 3) and two additional, interesting traits (All autoimmune and Vitamin D levels). Results for all 45 UK Biobank diseases and traits are reported in Supplementary Fig. 40. (c-e) Number of gene-PBMC cell type pairs fine-mapped using TGFM (y-axis; square root scale) in each of the 9 PBMC cell types (x-axis) at various PIP thresholds ranging from 0.2 to 1.0 (color-bar) for (c) Monocyte count, (d) Lymphocyte count, and (e) All autoimmune disease. Horizontal black lines denote the number of gene-PBMC cell type pairs fine-mapped at PIP=0.5. Asterisks denote statistical significance (FDR ≤ 0.05 via the TGFM tissue-specific prior; see Methods) of each PBMC cell type-trait pair. Results for all 45 UK Biobank diseases and traits are reported in Supplementary Fig. 41. Numerical results are reported in Supplementary Table 20.
Figure 8: Examples of fine-mapped gene-PBMC cell type-disease triplets identified by TGFM.
We report 4 example loci for which TGFM fine-mapped a gene-PBMC cell type pair (PIP > 0.5). In each example we report the marginal GWAS and TWAS association −log10 p-values (y-axis) of non-mediated variants (blue circles) and gene-tissue (or gene-PBMC cell type) pairs (red triangles). Marginal TWAS association −log10 p-values were calculated by taking the median −log10 TWAS p-value across the 100 sets of sampled cis-predicted expression models for each gene-tissue (or gene-PBMC cell type) pair. See the Methods section for a description of how GWAS association statistics were calculated. The genomic position of each gene-tissue (or gene-PBMC cell type) pair (x-axis) was based on the gene’s TSS. The color shading of each variant and gene-tissue (or gene-PBMC cell type) pair was determined by its TGFM PIP. Any genetic element with TGFM PIP > 0.5 was made larger in size. Dashed horizontal blue and red lines represent GWAS significance (5 × 10⁻⁸) and TWAS significance (4.1 × 10⁻⁷) thresholds, respectively. Numerical results are reported in Supplementary Table 22.
