Bioinformatics. 2024 Dec 26;41(1):btaf017.
doi: 10.1093/bioinformatics/btaf017.

Funmap: integrating high-dimensional functional annotations to improve fine-mapping


Yuekai Li et al. Bioinformatics.

Abstract

Motivation: Fine-mapping aims to prioritize causal variants underlying complex traits by accounting for the linkage disequilibrium at genome-wide association study risk loci. The expanding resources of functional annotations serve as auxiliary evidence to improve the power of fine-mapping. However, existing fine-mapping methods tend to generate many false positive results when integrating a large number of annotations.

Results: In this study, we propose a unified method to integrate high-dimensional functional annotations with fine-mapping (Funmap). Funmap can effectively improve the power of fine-mapping by borrowing information from hundreds of functional annotations. Meanwhile, it relates the annotations to the causal probability with a random effects model that avoids over-fitting, thereby producing a well-controlled false positive rate. Paired with a fast algorithm, Funmap enables scalable integration of a large number of annotations to facilitate prioritizing multiple causal single nucleotide polymorphisms (SNPs). Our comprehensive simulations across a wide range of annotation relevance settings demonstrate that Funmap is the only method that produces a well-calibrated false discovery rate under high-dimensional annotations while achieving better or comparable power gains compared to existing methods. By integrating genome-wide association studies of 4 lipid traits with 187 functional annotations, Funmap consistently identified more variants that can be replicated in an independent cohort, achieving a 15.5%-26.2% improvement over the runner-up in terms of replication rate.
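To make the idea of relating annotations to the causal probability concrete, the following is a minimal sketch, not Funmap's actual model or code. It assumes a softmax link between a linear annotation score and the prior causal probability of each SNP, and it treats the annotation effects as draws from a common zero-mean normal distribution (the random effects assumption), so a small shared variance keeps every effect close to zero. The abstract does not state the exact form of the link, and the function names, the toy annotation matrix, and the value of `sigma` are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def annotation_prior(annotations, weights):
    """Prior causal probability per SNP from a linear annotation score.

    annotations: p rows (one per SNP), each a list of m annotation values.
    weights: m annotation effects.
    Assumption: a softmax link; the source does not specify the link.
    """
    scores = [sum(a * w for a, w in zip(row, weights)) for row in annotations]
    return softmax(scores)


def draw_random_effects(m, sigma, seed=0):
    """Draw m annotation effects from N(0, sigma^2).

    A small sigma shrinks all effects toward zero, which is the
    mechanism that guards against over-fitting in this sketch.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(m)]


# Toy region: 4 SNPs, 3 binary annotations (made-up values).
annotations = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]

# All effects zero: every SNP gets the same prior probability.
flat_prior = annotation_prior(annotations, [0.0, 0.0, 0.0])

# Non-zero effects: SNPs carrying relevant annotations are up-weighted.
informed_prior = annotation_prior(annotations, [1.0, 0.5, 0.0])

# Effects drawn under a tight random effects prior.
shrunk_weights = draw_random_effects(m=3, sigma=0.1)
shrunk_prior = annotation_prior(annotations, shrunk_weights)
```

In this sketch, estimating one free coefficient per annotation would need hundreds of parameters when there are hundreds of annotations, whereas the random effects view ties them together through the single variance parameter `sigma`.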

Availability and implementation: The Funmap software and all analysis code are available at https://github.com/LeeHITsz/Funmap.


Figures

Figure 1.
Comparison of FDR control in simulation studies. (a, b) Calibration of FDR with n=50000, m=100, while the number of causal SNPs is set to L0=2 (a) and L0=3 (b). Results are summarized from 500 replications across 10 regions. (c) An illustrative example generated by simulation. The first column shows the absolute correlation among the two candidate causal SNPs and their neighboring SNPs and the Manhattan plot. The second to fourth columns show the PIP obtained with the compared methods. Red dots represent causal SNPs. Dots with the same outline color represent SNPs in the level-95% credible sets of a causal signal.
Figure 2.
Comparison of statistical power in simulation studies. (a, b) Statistical power of compared methods with n=50000, m=100, while the number of causal SNPs is set to L0=2 (a) and L0=3 (b). (c) An illustrative example generated by simulation.
Figure 3.
Comparison of PIP and CPU timings. (a, b) Comparison of PIP between Funmap and SuSiE (left panel), CARMA+anno (middle panel), and PAINTOR+anno (right panel) with n=50000, m=100, while L0 is varied at 2 (a) and 3 (b). (c) CPU timings are shown for increasing p with m=100 (left panel) and increasing m with p=1833. (d) Boxplot displays the size of the 95% credible sets from the simulation results with n=50000, m=100, L0 ∈ {2, 3}.
Figure 4.
Replication analysis of Funmap, CARMA+anno, and PAINTOR+anno. Bar charts on the top show the fraction and number of newly identified SNPs with P-value < 5×10⁻⁸ in the replication cohorts of GLGC GWAS. Bar charts on the bottom show the fraction and number of newly identified SNPs that are included in the 95%-level credible sets generated from GLGC GWAS with SuSiE.
Figure 5.
Comparison of credible set size and fine-mapping results from a region of TC GWAS. (a) Box plots of credible set size across four lipid traits. (b) Fine-mapping results of TC from locus 6 Mb–7 Mb in chromosome 8. The first column shows the heatmap of absolute correlation between rs2928617 and its neighboring SNPs and the Manhattan plot. The red dashed line represents 5×10⁻⁸. The second to fourth columns show the PIP obtained with the compared methods. The purple square represents SNP rs2928617 and the color of the points represents the correlation between neighboring SNPs and rs2928617. Dots with the same outline color represent SNPs in the level-95% credible sets of a causal signal.
Figure 6.
Box plot for Funmap annotation importance scores across 864 genomic regions of four lipid traits (190–374 regions per trait).
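
The captions above rely on two quantities that are easy to state in code: the level-95% credible set of a causal signal and the empirical FDR at a PIP threshold. The following is a minimal sketch under stated assumptions, not the procedure used in the paper. It assumes a credible set is the smallest group of SNPs whose per-signal posterior probabilities sum to at least the coverage level, and that empirical FDR is the fraction of SNPs selected at a PIP threshold that are not truly causal (known only in simulation). The function names and all numbers are hypothetical.

```python
def credible_set(probs, coverage=0.95):
    """Smallest set of SNP indices whose probabilities reach `coverage`.

    probs: per-SNP posterior probabilities for ONE causal signal,
    assumed to sum to 1.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append(i)
        total += probs[i]
        if total >= coverage:
            break
    return chosen


def empirical_fdr(pips, causal, threshold):
    """Fraction of SNPs with PIP >= threshold that are not causal.

    pips: posterior inclusion probability per SNP.
    causal: set of indices of the simulated causal SNPs.
    Returns 0.0 when nothing is selected.
    """
    selected = [i for i, pip in enumerate(pips) if pip >= threshold]
    if not selected:
        return 0.0
    false_hits = sum(1 for i in selected if i not in causal)
    return false_hits / len(selected)


# One causal signal spread over five correlated SNPs (made-up values).
signal_probs = [0.60, 0.30, 0.06, 0.03, 0.01]
cs = credible_set(signal_probs)

# Five SNPs, of which indices 0 and 1 are the simulated causal ones.
pips = [0.99, 0.92, 0.95, 0.10, 0.02]
fdr = empirical_fdr(pips, causal={0, 1}, threshold=0.9)
```

Here the top two SNPs cover only 0.90 of the signal, so the credible set needs a third SNP, and one of the three SNPs selected at the 0.9 threshold is a false discovery.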
