Genet Epidemiol. 2019 Mar;43(2):150-165.
doi: 10.1002/gepi.22171. Epub 2018 Nov 19.

Using Bayes model averaging to leverage both gene main effects and G × E interactions to identify genomic regions in genome-wide association studies

Lilit C Moss et al. Genet Epidemiol. 2019 Mar.

Abstract

Genome-wide association studies typically search for marginal associations between a single-nucleotide polymorphism (SNP) and a disease trait while gene-environment (G × E) interactions remain generally unexplored. More powerful methods beyond the simple case-control (CC) approach leverage either marginal effects or CC ascertainment to increase power. However, these potential gains depend on assumptions whose aptness is often unclear a priori. Here, we review G × E methods and use simulations to highlight performance as a function of main and interaction effects and the association of the two factors in the source population. Substantial variation in performance between methods leads to uncertainty as to which approach is most appropriate for any given analysis. We present a framework that (a) balances the robustness of a CC approach with the power of the case-only (CO) approach; (b) incorporates main SNP effects; (c) allows for incorporation of prior information; and (d) allows the data to determine the most appropriate model. Our framework is based on Bayes model averaging, which provides a principled statistical method for incorporating model uncertainty. We average over inclusion of parameters corresponding to the main and G × E interaction effects and the G-E association in controls. The resulting method exploits the joint evidence for main and interaction effects while gaining power from a CO equivalent analysis. Through simulations, we demonstrate that our approach detects SNPs within a wide range of scenarios with increased power over current methods. We illustrate the approach on a gene-environment scan in the USC Children's Health Study.
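As a rough illustration of the model-averaging idea described in the abstract, the sketch below turns per-model fit scores into posterior model weights and then averages an effect estimate across models. It is not the authors' implementation: the BIC approximation to the marginal likelihood is an assumption made here for simplicity, and the function names `bma_weights` and `bma_estimate` are hypothetical. The optional `priors` argument shows where prior information, such as odds favoring a CC-type model over a CO-type model, could enter.

```python
import math

def bma_weights(bics, priors=None):
    """Approximate posterior model probabilities from BIC values.

    Assumption (not from the paper): exp(-BIC/2) stands in for each
    model's marginal likelihood. `priors` are prior model probabilities;
    a uniform prior is used when none are given.
    """
    n = len(bics)
    if priors is None:
        priors = [1.0 / n] * n
    best = min(bics)  # subtract the minimum BIC for numerical stability
    raw = [p * math.exp(-0.5 * (b - best)) for b, p in zip(bics, priors)]
    total = sum(raw)
    return [r / total for r in raw]

def bma_estimate(estimates, weights):
    """Model-averaged estimate: weighted sum of per-model estimates."""
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical example: two candidate models with equal fit (same BIC),
# but prior odds of 100:1 in favor of the first model.
weights = bma_weights([250.0, 250.0], priors=[100 / 101, 1 / 101])
averaged = bma_estimate([0.40, 0.55], weights)
```

With equal BIC values the posterior weights simply reproduce the prior odds; when the data favor one model strongly, the BIC difference dominates and the prior matters less.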

Keywords: Bayesian model; case-control studies; environmental factor; genome-wide scan; power.

Figures

Figure 1
Heatmaps depicting power patterns for detection of GxE interaction across a marginal G and GxE interaction effect range r = [−1.0, +1.0] for one-step methods on 1,000 simulations of 500 cases and 500 controls. Within each heatmap plot in the grid, the x-axis shows the simulated marginal G effect with the null indicated by a vertical line. The y-axis is the simulated GxE effect with the null indicated by a horizontal line. The grid columns of Figure 1 represent the simulated G-E association in the population.
Figure 2
Empirical power measured across a range r = [−1.0, +1.0] of G-E association, with and without a GxE interaction and marginal effect, for the CO DF2 and BMA DF2 approaches. BMA(100:1) and BMA(1:100) represent analyses of BMA DF2 with prior weighting based on 100:1 and 1:100 odds, respectively, of a CC model being more appropriate than a CO model. A) OR(GxE)=1.0 & OR(G)=1.0, B) OR(GxE)=1.5 & OR(G)=1.0, C) OR(GxE)=1.0 & OR(G)=1.2, D) OR(GxE)=1.5 & OR(G)=1.2.
Figure 3
Empirical power vs. OR(GxE) with independence between G and E (plots A-C). Based on genome-wide simulations of 1 million SNPs with 1000 repetitions and one designated causal SNP in each repetition. A) OR(G) = 1.0 & OR(E) = 1.0; B) OR(G) = 1.2 & OR(E) = 1.2; C) Both OR(G) and OR(E) are induced by the interaction effect and are not held constant.
Figure 4
Receiver operating characteristic (ROC) curves for true and false positives in simulations of 1000 repetitions of 10,000 SNPs. A) 20 SNPs with non-zero GxE interaction (causal), no non-causal SNPs associated with E, marginal effect of causal SNPs present. B) 20 SNPs with non-zero GxE interaction (causal), 500 non-causal SNPs associated with E, marginal effect of causal SNPs present. C) 20 SNPs with non-zero GxE interaction (causal), 500 non-causal SNPs associated with E, no marginal effect of causal SNPs.
