A flexible Bayesian model for studying gene-environment interaction
- PMID: 22291610
- PMCID: PMC3266891
- DOI: 10.1371/journal.pgen.1002482
Abstract
An important follow-up step after genetic markers are found to be associated with a disease outcome is a more detailed analysis investigating how the implicated gene or chromosomal region and an established environment risk factor interact to influence the disease risk. The standard approach to this study of gene-environment interaction considers one genetic marker at a time and therefore could misrepresent and underestimate the genetic contribution to the joint effect when one or more functional loci, some of which might not be genotyped, exist in the region and interact with the environment risk factor in a complex way. We develop a more global approach based on a Bayesian model that uses a latent genetic profile variable to capture all of the genetic variation in the entire targeted region and allows the environment effect to vary across different genetic profile categories. We also propose a resampling-based test derived from the developed Bayesian model for the detection of gene-environment interaction. Using data collected in the Environment and Genetics in Lung Cancer Etiology (EAGLE) study, we apply the Bayesian model to evaluate the joint effect of smoking intensity and genetic variants in the 15q25.1 region, which contains a cluster of nicotinic acetylcholine receptor genes and has been shown to be associated with both lung cancer and smoking behavior. We find evidence for gene-environment interaction (P-value = 0.016), with the smoking effect appearing to be stronger in subjects with a genetic profile associated with a higher lung cancer risk; the conventional test of gene-environment interaction based on the single-marker approach is far from significant.
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
Captions survive only in part; the symbol for the plotted quantity was lost in extraction. The figures show boxplots of posterior medians of that quantity across simulated datasets, by cluster:
- Figures with two clusters: (a) subjects in cluster 1, with the true value given by the horizontal line in green; (b) subjects in cluster 2, with the true value given by the horizontal line in blue.
- Figures with three clusters: (a) subjects in cluster 1, true value in green; (b) subjects in cluster 2, true value in blue; (c) subjects in cluster 3, true value in red.
In one figure of each kind, the posterior median for each subject under a given simulated dataset was shifted by a constant value selected so that the median value of the shifted estimates for subjects in cluster 1 was zero.
