2018 Aug 1;34(15):2538-2545.
doi: 10.1093/bioinformatics/bty147.

A Bayesian framework for multiple trait colocalization from summary association statistics

Claudia Giambartolomei et al. Bioinformatics.

Abstract

Motivation: Most genetic variants implicated in complex diseases by genome-wide association studies (GWAS) are non-coding, making it challenging to understand the causative genes involved in disease. Integrating external information such as quantitative trait locus (QTL) mapping of molecular traits (e.g. expression, methylation) is a powerful approach to identify the subset of GWAS signals explained by regulatory effects. In particular, expression QTLs (eQTLs) help pinpoint the responsible gene among the GWAS regions that harbor many genes, while methylation QTLs (mQTLs) help identify the epigenetic mechanisms that impact gene expression, which in turn affects disease risk. In this work, we propose multiple-trait-coloc (moloc), a Bayesian statistical framework that integrates GWAS summary data with multiple molecular QTL data to identify regulatory effects at GWAS risk loci.

Results: We applied moloc to schizophrenia (SCZ) and eQTL/mQTL data derived from human brain tissue and identified 52 candidate genes that influence SCZ through methylation. Our method can be applied to any GWAS and relevant functional data to help prioritize disease-associated genes.

Availability and implementation: moloc is available for download as an R package (https://github.com/clagiamba/moloc). We also developed a web site to visualize the biological findings (icahn.mssm.edu/moloc). The browser allows searches by gene, methylation probe and scenario of interest.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Graphical representation of four possible configurations at a locus with eight SNPs in common across three traits. The traits are labeled G, E and M, representing the GWAS (G), eQTL (E) and mQTL (M) datasets, respectively. Each plot represents one possible configuration, that is, one combination of three binary vectors indicating whether each variant is associated with the corresponding trait. Top left panel (GEM scenario): one causal variant behind all of the associations. Top right panel (GE scenario): the same causal variant behind the G and E associations, and no association or lack of power for the M association. Bottom left panel (GE.M scenario): two causal variants, one shared by G and E and a different one for M. Bottom right panel (G.E.M scenario): three distinct causal variants, one behind each of the datasets considered.
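
The binary-vector encoding described in this caption can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the moloc package: the causal SNP positions are hypothetical, the helper names `config` and `label` are invented here, and the sketch assumes at most one causal variant per trait.

```python
N_SNPS = 8      # locus with eight SNPs, as in Fig. 1
TRAITS = "GEM"  # GWAS, eQTL, mQTL

def config(causal):
    """Build one binary vector per trait from {trait: causal SNP index}.

    A trait missing from `causal` gets an all-zero vector (no association).
    """
    return {t: [int(i == causal.get(t)) for i in range(N_SNPS)] for t in TRAITS}

def label(cfg):
    """Name a configuration: traits sharing a causal SNP are written together,
    distinct causal variants are separated by a dot."""
    groups = {}
    for t in TRAITS:
        if 1 in cfg[t]:
            groups.setdefault(cfg[t].index(1), []).append(t)
    parts = sorted(("".join(g) for g in groups.values()),
                   key=lambda p: TRAITS.index(p[0]))
    return ".".join(parts) or "null"

# Hypothetical causal positions reproducing the four panels of Fig. 1.
examples = [
    {"G": 2, "E": 2, "M": 2},  # one shared causal variant
    {"G": 2, "E": 2},          # shared by G and E, nothing for M
    {"G": 2, "E": 2, "M": 5},  # G and E share one, M has another
    {"G": 1, "E": 4, "M": 6},  # three distinct causal variants
]
labels = [label(config(c)) for c in examples]
print(labels)
```

Each scenario name is therefore just a summary of which traits place their single 1 at the same SNP position.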
Fig. 2.
Results from simulations under colocalization/non-colocalization scenarios (A, B), and results from real data application (C). (A) Simulations under different sample sizes for all scenarios in moloc of three traits (GWAS, eQTL and mQTL). The y-axis shows the median, 10% and 90% quantiles of the distribution of posterior probabilities ('PPA') supporting each of our scenarios of interest. Combined scenarios include gene-methylation pairs or genes that reach a posterior probability of GEM >= 80%, or + GE.M >= 80%, or GE >= 80%. All cases include 10 000 individuals in the GWAS dataset. The variance explained by the trait was set to 0.01 (1%) for GWAS, and to 0.1 (10%) for the eQTL and mQTL. (B) Posterior probabilities from simulations using a sample size of 10 000 individuals for the GWAS trait (denoted G), 300 for the eQTL trait (denoted E) and 300 for the mQTL trait (denoted M). The x-axis shows all 15 simulated scenarios, e.g. G.E.M, three different causal variants for each of the three traits. The y-axis shows the distribution of posterior probabilities under the simulated scenario. The height of the bar represents the mean of the PPA for each configuration across simulations. (C) Venn diagram comparing the number of colocalizations of two traits (coloc PPA >= 80%) with three traits (moloc PPA GE + GE.M + GEM).
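
The figure of 15 scenarios for three traits can be checked by enumeration. The sketch below is a hedged illustration, not the moloc implementation: it assumes that each trait has at most one causal variant and that the count includes the null configuration with no association. Under those assumptions, every scenario is a way of choosing which traits have a signal and then grouping those traits by shared causal variant. The function names and the ordering of trait letters within a scenario name are conventions chosen here.

```python
from itertools import combinations

def partitions(items):
    """Yield every way of splitting `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # `first` forms its own group ...
        yield [[first]] + part
        # ... or joins one of the existing groups.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def scenarios(traits="GEM"):
    """List scenario names: 'null', then every grouping of every
    non-empty subset of associated traits."""
    names = ["null"]
    for k in range(1, len(traits) + 1):
        for subset in combinations(traits, k):
            for part in partitions(list(subset)):
                blocks = sorted(("".join(b) for b in part),
                                key=lambda s: traits.index(s[0]))
                names.append(".".join(blocks))
    return names

all_scenarios = scenarios()
print(len(all_scenarios), all_scenarios)
```

The count breaks down as 1 null configuration, 3 single-trait scenarios, 6 two-trait scenarios (shared or distinct causal variant for each pair) and 5 three-trait scenarios.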
