Nat Genet. 2025 Aug;57(8):1881-1889.
doi: 10.1038/s41588-025-02262-7. Epub 2025 Jul 21.

Improved multiancestry fine-mapping identifies cis-regulatory variants underlying molecular traits and disease risk


Zeyun Lu et al. Nat Genet. 2025 Aug.

Abstract

Multiancestry statistical fine-mapping of cis-molecular quantitative trait loci (cis-molQTL) aims to improve the precision of distinguishing causal cis-molQTLs from tagging variants. Here we present the sum of shared single effects (SuShiE) model, which leverages linkage disequilibrium heterogeneity to improve fine-mapping precision, infer cross-ancestry effect size correlations and estimate ancestry-specific expression prediction weights. Through extensive simulations, we find that SuShiE consistently outperforms existing methods. We apply SuShiE to 36,907 molecular phenotypes including mRNA expression and protein levels from individuals of diverse ancestries in the TOPMed-MESA and GENOA studies. SuShiE fine-maps cis-molQTLs for 18.2% more genes compared with existing methods while prioritizing fewer variants and exhibiting greater functional enrichment. While SuShiE infers highly consistent cis-molQTL architectures across ancestries, it finds evidence of heterogeneity at genes with predicted loss-of-function intolerance. Lastly, using SuShiE-derived cis-molQTL effect sizes, we perform transcriptome- and proteome-wide association studies on six white blood cell-related traits in the All of Us biobank and identify 25.4% more genes compared with existing methods. Overall, SuShiE provides new insights into the cis-genetic architecture of molecular traits.

Conflict of interest statement

Competing interests: L.W. provided consulting services to Pupil Bio Inc. and reviewed manuscripts for Gastroenterology Report, neither related to this study, and received an honorarium. S.G. received consulting fees from Eleven Therapeutics unrelated to this work. The other authors declare no competing interests.

Figures

Figure 1. SuShiE infers PIPs, credible sets, and ancestry-specific effect sizes by leveraging shared genetic architectures and LD heterogeneity
a) SuShiE takes individual-level phenotypic and genotypic data as input and models the shared cis-molQTL effects as a linear combination of single effects. b) For each shared single effect, SuShiE assumes that the cis-molQTL effect sizes follow a multivariate normal prior distribution with a covariance matrix and that the probability of each SNP being the cis-molQTL follows a uniform prior distribution. Through inference, SuShiE outputs a credible set that includes putative causal cis-molQTLs, learns the effect-size covariance prior, and estimates the ancestry-specific effect sizes.
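The model described in this caption can be written out roughly as follows. This is a sketch in assumed notation, not the paper's own: the symbols $k$ (number of ancestries), $p$ (number of cis-SNPs), $L$ (number of single effects), $C_l$ (prior effect-size covariance), and the independent Gaussian noise term are all illustrative choices.

```latex
% For ancestry i = 1, ..., k with phenotype y_i and genotype matrix X_i:
y_i = X_i \beta_i + \epsilon_i, \qquad
\beta_i = \sum_{l=1}^{L} \gamma_l \, b_{l,i}, \qquad
\epsilon_i \sim N(0, \sigma_i^2 I)  % noise model is an assumption

% For each shared single effect l = 1, ..., L:
b_l = (b_{l,1}, \dots, b_{l,k})^\top \sim N(0, C_l), \qquad
\gamma_l \sim \mathrm{Multinomial}(1, \pi), \qquad
\pi = (1/p, \dots, 1/p)
```

Here $\gamma_l$ is a one-hot vector that selects the SNP carrying effect $l$ and is shared by all ancestries, while the off-diagonal entries of $C_l$ capture how strongly the effect sizes covary across ancestries.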
Figure 2. SuShiE outperforms other methods, accurately estimates effect-size correlation, and increases TWAS power in realistic simulations
a-c) SuShiE outputs higher posterior inclusion probabilities (PIPs; a; 0.09 on average; P=3.65e-233), smaller credible set sizes (b; 0.15 on average; P=2.02e-12), and a higher frequency of cis-molQTLs in the credible sets (calibration; c; 7.6% on average; P=4.59e-126) compared with other methods. d) SuShiE accurately estimates the true effect-size correlation across ancestries, while XMAP and XMAP-IND frequently produce biased estimates. e-f) SuShiE yields higher ancestry-specific prediction accuracy (P=1.39e-144) and higher TWAS power (P=1.98e-285) compared with other methods at a fixed sample size. The plots aggregate results across the two ancestries. By default for (a)-(f), the simulation assumes that there are 2 causal cis-molQTLs, the per-ancestry training sample size is 400, the testing sample size is 200, the cis-SNP heritability is 0.05, the effect-size correlation across ancestries is 0.8, and the proportion of cis-SNP heritability of the complex trait explained by gene expression is 1.5e-4. P values are two-sided, not adjusted for multiple testing, and calculated using meta-analysis across all comparisons (Methods). The points are the means across simulations, and the error bars are their corresponding 95% confidence intervals.
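To make the PIP and credible-set metrics in this caption concrete, the sketch below shows how they are commonly derived in sum-of-single-effects models from a matrix of per-effect inclusion probabilities. It is illustrative only and not SuShiE's implementation: the function names, the example `alpha` matrix, and the 0.95 coverage level are assumptions, and any additional filtering of credible sets is omitted.

```python
import numpy as np

def pips_from_alpha(alpha):
    """Combine per-effect inclusion probabilities into per-SNP PIPs.

    alpha has shape (L, p): row l holds the posterior probability that each
    of the p SNPs is the causal variant for single effect l (rows sum to 1).
    A SNP's PIP is the probability that at least one effect selects it.
    """
    alpha = np.asarray(alpha, dtype=float)
    return 1.0 - np.prod(1.0 - alpha, axis=0)

def credible_set(alpha_row, coverage=0.95):
    """Smallest set of SNP indices whose probabilities sum to >= coverage.

    The 0.95 default is an assumed, conventional coverage level.
    """
    alpha_row = np.asarray(alpha_row, dtype=float)
    order = np.argsort(-alpha_row, kind="stable")   # most probable first
    cumulative = np.cumsum(alpha_row[order])
    # First position where the running total reaches the coverage level
    # (small tolerance guards against floating-point rounding).
    n_keep = int(np.searchsorted(cumulative, coverage - 1e-12)) + 1
    n_keep = min(n_keep, alpha_row.size)
    return sorted(order[:n_keep].tolist())

# Hypothetical posterior for L = 2 single effects over p = 4 SNPs.
alpha = np.array([
    [0.90, 0.06, 0.03, 0.01],
    [0.02, 0.03, 0.50, 0.45],
])
pips = pips_from_alpha(alpha)
sets = [credible_set(row) for row in alpha]
print(pips, sets)
```

Under this definition, a smaller credible set means the posterior mass for an effect is concentrated on fewer SNPs, which is the sense in which panel b reports improved precision.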
Figure 3. SuShiE reveals cis-regulatory mechanisms for mRNA and protein expression
a) SuShiE identifies cis-molQTLs for 13,818, 515, and 5,548 genes, of which 89.1%, 86.8%, and 96.0% contain 1–3 cis-molQTLs, for the TOPMed-MESA mRNA, TOPMed-MESA protein, and GENOA mRNA datasets, respectively. b) Posterior inclusion probabilities (PIPs) of cis-molQTLs inferred by SuShiE are mainly enriched around the TSS region of genes. We group SNPs into 500-bp-long bins and compute their average PIP. There are 2,000 bins covering a one-million-bp-long genomic window around the genes’ TSS. c) Across all three studies, cis-molQTLs identified by SuShiE are enriched in four out of five candidate cis-regulatory elements (cCREs) from ENCODE, with the promoter (PLS) as the most enriched category. Specifically, mRNA expression from TOPMed-MESA and GENOA shows enrichment in the promoter, proximal enhancer (pELS), CTCF, and distal enhancer (dELS) categories but depletion in DNase-H3K4me3. Protein abundance from TOPMed-MESA shows enrichment in PLS and pELS but non-significant enrichment in CTCF and dELS because of the low number of genes identified with pQTLs (n=515). The points are meta-analyzed log-enrichment values across genes, and the error bars are their corresponding 95% confidence intervals.
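The binning described in panel b can be sketched as follows for a single gene. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the example coordinates, the choice to center the window on the TSS, and ignoring strand orientation are all hypothetical.

```python
import numpy as np

BIN_SIZE = 500                        # bp per bin, as stated in the caption
N_BINS = 2000                         # 2,000 bins x 500 bp = 1,000,000 bp
HALF_WINDOW = BIN_SIZE * N_BINS // 2  # assumed: window is centered on the TSS

def mean_pip_by_tss_bin(positions, pips, tss):
    """Average PIP within 500-bp bins of signed distance to the TSS.

    Returns an array of length N_BINS; bins without SNPs are NaN.
    SNPs outside the one-million-bp window are ignored.
    """
    positions = np.asarray(positions, dtype=np.int64)
    pips = np.asarray(pips, dtype=float)
    offset = positions - tss + HALF_WINDOW        # 0 at the window's left edge
    inside = (offset >= 0) & (offset < BIN_SIZE * N_BINS)
    bin_index = offset[inside] // BIN_SIZE
    sums = np.bincount(bin_index, weights=pips[inside], minlength=N_BINS)
    counts = np.bincount(bin_index, minlength=N_BINS)
    means = np.full(N_BINS, np.nan)
    nonempty = counts > 0
    means[nonempty] = sums[nonempty] / counts[nonempty]
    return means

# Hypothetical SNP positions and PIPs around a TSS at 1,000,000 bp.
tss = 1_000_000
positions = [1_000_000, 1_000_100, 1_000_600, 499_999, 1_499_999]
pips = [0.8, 0.2, 0.4, 0.9, 0.1]
means = mean_pip_by_tss_bin(positions, pips, tss)
```

Pooling SNPs from many genes into the same distance bins before averaging would give a genome-wide profile like the one plotted; the per-gene version above only shows the binning arithmetic.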
Figure 4. SuShiE identifies eQTL rs2528382 for URGCP with functional support
a) LD patterns of the region across EUR, AFR, and HIS. The blue color indicates the LD scale (r2), the red color indicates the prioritized SNP location, and the green color indicates SNPs’ LD r2 with the prioritized SNP. Manhattan plots of cis-eQTL scans of URGCP (denoted in orange) are shown for each ancestry with SuShiE fine-mapping results. SuShiE is the only method to output credible sets for URGCP and prioritize a single SNP (rs2528382; denoted in red). b) Functional annotations at the URGCP locus show colocalization of active enhancer activity and chromatin accessibility with rs2528382. H3K27ac ChIP-seq peaks are measured in PBMCs (intensity denoted in blue), 0/1 accessibility annotations determined from scATAC-seq are measured in PBMCs, and 0/1 accessibility annotations determined from snATAC-seq are measured in naive T cells, naive B cells, cytotoxic NK (cNK) cells, and monocytes. Blue rectangles denote putative cCREs called from sc/snATAC-seq data that colocalize with rs2528382 (gray denotes no colocalization).
Figure 5. SuShiE identifies more T/PWAS genes and higher chi-square statistics compared with SuSiE and MESuSiE
a) Scatter plot of T/PWAS t-statistics comparing SuShiE (y-axis) with SuSiE and MESuSiE across all phenotypes and contributing cis-molQTL studies. SuShiE identifies 27 and 52 more T/PWAS-significant genes than SuSiE and MESuSiE, respectively. Overall, SuShiE displays higher T/PWAS chi-square statistics than SuSiE by 0.08 and than MESuSiE by 0.01, with bootstrapped p-values of 1.43e-39 and 3.21e-2, respectively (one-sided and not adjusted for multiple testing; Methods). The black dashed line represents the identity line (y = x). Genes identified as significant by both methods are shown in purple (Both), while those not identified by either method are shown in grey (Neither). b) Average T/PWAS chi-square statistics across all phenotypes and contributing cis-molQTL studies within low, middle, and high constraint scores for SuShiE, SuSiE, and MESuSiE (Methods). Error bars represent 95% confidence intervals.
