PLoS One. 2014 Mar 26;9(3):e92310.

doi: 10.1371/journal.pone.0092310. eCollection 2014.

Discovering pair-wise genetic interactions: an information theory-based approach

Tomasz M Ignac¹, Alexander Skupin², Nikita A Sakhanenko³, David J Galas¹

Affiliations

¹ Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg; Pacific Northwest Diabetes Research Institute, Seattle, Washington, United States of America.
² Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg; National Center for Microscopy and Imaging Research, University of California San Diego, La Jolla, California, United States of America.
³ Pacific Northwest Diabetes Research Institute, Seattle, Washington, United States of America.

PMID: 24670935
PMCID: PMC3966778
DOI: 10.1371/journal.pone.0092310

Tomasz M Ignac et al. PLoS One. 2014.

Abstract

Phenotypic variation, including that which underlies health and disease in humans, results in part from multiple interactions among both genetic variation and environmental factors. While diseases or phenotypes caused by single gene variants can be identified by established association methods and family-based approaches, complex phenotypic traits resulting from multi-gene interactions remain very difficult to characterize. Here we describe a new method based on information theory, and demonstrate how it improves on previous approaches to identifying genetic interactions, including both synthetic and modifier kinds of interactions. We apply our measure, called interaction distance, to previously analyzed data sets of yeast sporulation efficiency, lipid-related mouse data and several human disease models to characterize the method. We show how the interaction distance can reveal novel gene interaction candidates in experimental and simulated data sets, and outperforms other measures in several circumstances. The method also allows us to optimize case/control sample composition for clinical studies.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Genetic interactions in yeast.**
A) Conditional entropy of the phenotype given a single marker. B) A heat map of the conditional entropy of the phenotype given two markers. Note the stripes caused by the markers with the strongest single effects, which make pairs with small effects difficult to detect, especially when the number of markers is large. C) Interaction distance between marker 7.9, the marker with the strongest marginal effect, and every other marker. The “negative peak” shows that neighboring markers contain redundant information. Most values fluctuate around zero because the corresponding markers do not interact with 7.9. D) A heat map of the interaction distance for all pairs of markers and the phenotype. Note that the stripes are no longer present.
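The conditional entropies plotted in panels A and B can be estimated from discrete genotype and phenotype labels. The following is a minimal sketch, assuming plug-in (frequency-count) estimates in bits; the function names and the toy data are hypothetical and are not taken from the paper, and the interaction distance itself is not defined in this text, so it is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Plug-in Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(phenotype, *markers):
    """H(phenotype | markers) = H(phenotype, markers) - H(markers), in bits."""
    joint = list(zip(phenotype, *markers))
    given = list(zip(*markers))
    return entropy(joint) - entropy(given)


# Hypothetical toy data: the phenotype is the XOR of two binary markers,
# so neither marker is informative alone but the pair is fully informative.
m1 = [0, 0, 1, 1]
m2 = [0, 1, 0, 1]
pheno = [0, 1, 1, 0]

h_single = conditional_entropy(pheno, m1)    # one marker (as in panel A)
h_pair = conditional_entropy(pheno, m1, m2)  # two markers (as in panel B)
print(h_single, h_pair)
```

In this toy case the single-marker value stays at the full phenotype entropy while the pair value drops to zero, which is the kind of purely synergistic pair that a single-marker scan would miss.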

**Figure 2. Genetic interactions in body weight phenotype of mouse.**
The first row, panels (A–C), shows results of the genetic analysis of body weight for the full cohort of 303 mice (no division by sex). Panel (A) shows the conditional entropy of the phenotype given a single marker, (B) shows the conditional entropy given two markers, and (C) shows the interaction distance. The strongest effect on the phenotype in this case comes from markers located on the Y chromosome, which is present only in males. This is expected since weight is strongly correlated with sex. Rows (D–F) and (G–I) show data for the female and male subpopulations, respectively. A comparison of (D) and (G) reveals sex-specific QTLs affecting the phenotype. Panels (B), (E) and (H) exhibit the characteristic stripe pattern, which masks the more subtle synthetic and modifying interactions. Finally, (C), (F) and (I) plot the ID scores for all pairs of markers. The red spots in panels (C, F, I) and the blue/yellow spots in panels (B, E, H) point out potentially interesting pairs, which are subjects of further investigation.

**Figure 3. Detection of SNP interactions in a human disease model.**
Each main panel shows interaction distance (red) and interaction information (grey) computed on simulated data from a human disease model M15 of the interaction between SNPs A and B defined in the table. Solid lines describe analytical expectation values, dots show average values obtained from simulations and shaded bands describe the corresponding standard deviations (see Methods for more details). The upper sub-panels show the conditional entropy of the phenotype given SNP A (blue) and SNP B (cyan), respectively. The entropy illustrates the strength of the marginal effect of a given SNP. The minor allele frequency of SNP B was fixed at 1% in panel (A) and 2.5% in panel (B). Lower sub-panels show the effect of changing the value of . More precisely, the lower sub-panel on the left shows expected values of the interaction distance, and the one on the right shows expected values of the interaction information, as functions of different values of and .
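An analytical expectation of the kind drawn as solid lines can be sketched for the interaction information, which has a standard definition. The sketch below assumes independent SNPs in Hardy-Weinberg equilibrium, a population-based (not case/control) joint distribution, and the sign convention I(A,B;D) - I(A;D) - I(B;D); the penetrance table is a hypothetical XOR-like example, not model M15, and the paper's interaction distance is not reproduced because its formula does not appear in this text.

```python
import math
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1 and 2 copies of the minor allele."""
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """I(G; D) for a joint distribution given as {(g, d): probability}."""
    pg, pd = {}, {}
    for (g, d), p in joint.items():
        pg[g] = pg.get(g, 0.0) + p
        pd[d] = pd.get(d, 0.0) + p
    return entropy(pg.values()) + entropy(pd.values()) - entropy(joint.values())


def interaction_information(maf_a, maf_b, penetrance):
    """I(A,B; D) - I(A; D) - I(B; D) for a 3x3 penetrance table P(D=1 | a, b)."""
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)
    full, ja, jb = {}, {}, {}
    for a, b in product(range(3), repeat=2):
        for d in (0, 1):
            p_d = penetrance[a][b] if d == 1 else 1 - penetrance[a][b]
            p = fa[a] * fb[b] * p_d
            full[((a, b), d)] = p
            ja[(a, d)] = ja.get((a, d), 0.0) + p
            jb[(b, d)] = jb.get((b, d), 0.0) + p
    return (mutual_information(full)
            - mutual_information(ja)
            - mutual_information(jb))


# Hypothetical table: disease when exactly one SNP carries a minor allele.
xor_like = [[0, 1, 1],
            [1, 0, 0],
            [1, 0, 0]]

for maf_a in (0.05, 0.1, 0.2, 0.5):
    print(maf_a, interaction_information(maf_a, 0.5, xor_like))
```

Sweeping one allele frequency while holding the other fixed, as in the loop, mirrors the layout of the main panels, although the values are specific to the hypothetical table used here.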

**Figure 4. Detection of SNP interactions in further disease models.**
Additional simulations showing the performance of the ID (red) and II (grey) for various models. To mimic a scenario in which the disease can be caused by other factors (e.g., other mutations, environmental factors), we added noise to some of the models, which takes the form of non-binary penetrance tables.
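One way to picture how noise turns a binary penetrance table into a non-binary one is sketched below. It assumes a simple symmetric model in which disease status is flipped with probability `eps` regardless of genotype; this noise model, the table, and the function names are hypothetical illustrations, and the paper's actual noisy tables are not given in this text.

```python
import random


def add_noise(penetrance, eps):
    """Flip disease status with probability eps, independently of genotype."""
    return [[(1 - eps) * p + eps * (1 - p) for p in row] for row in penetrance]


def simulate(n, maf_a, maf_b, penetrance, seed=0):
    """Draw n individuals as (genotype_a, genotype_b, disease) triples.

    Genotypes count minor alleles (0, 1 or 2) at two independent SNPs.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        a = sum(rng.random() < maf_a for _ in range(2))
        b = sum(rng.random() < maf_b for _ in range(2))
        d = int(rng.random() < penetrance[a][b])
        samples.append((a, b, d))
    return samples


# Hypothetical binary table: disease only when both SNPs carry a minor allele.
binary = [[0, 0, 0],
          [0, 1, 1],
          [0, 1, 1]]

noisy = add_noise(binary, eps=0.1)  # entries become 0.1 or 0.9
samples = simulate(1000, maf_a=0.3, maf_b=0.2, penetrance=noisy)
print(noisy)
print(samples[:5])
```

With `eps = 0` the table is unchanged, and with `eps = 0.5` every entry becomes 0.5, so the genotype carries no information about the phenotype.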

**Figure 5. Relationship between the case-to-control ratio and ID.**
Among other factors, optimal detection of synergistic effects depends on the case-to-control ratio f of the study. Panels (A,B) show the dependency of ID values on allele frequencies and ratio f for models M15 (A) and M84 (B).
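The dependence of an information measure on the study composition can be illustrated by reweighting a population model to a fixed number of cases per control. The sketch below assumes that f means cases per control and uses mutual information between genotype and status as a stand-in measure; the function names and the two-genotype example are hypothetical, and the paper's ID is not computed here because its formula does not appear in this text.

```python
import math


def case_control_joint(geno_freqs, penetrance, f):
    """Joint P(genotype, status) in a study that samples f cases per control.

    geno_freqs[g] is the population frequency of genotype g and
    penetrance[g] is P(disease | g).
    """
    prevalence = sum(q * p for q, p in zip(geno_freqs, penetrance))
    w_case = f / (1 + f)  # fraction of the study sample that are cases
    joint = {}
    for g, (q, p) in enumerate(zip(geno_freqs, penetrance)):
        joint[(g, 1)] = w_case * q * p / prevalence
        joint[(g, 0)] = (1 - w_case) * q * (1 - p) / (1 - prevalence)
    return joint


def mutual_information(joint):
    """I(G; D) in bits for a joint distribution {(g, d): probability}."""
    pg, pd = {}, {}
    for (g, d), p in joint.items():
        pg[g] = pg.get(g, 0.0) + p
        pd[d] = pd.get(d, 0.0) + p
    return sum(p * math.log2(p / (pg[g] * pd[d]))
               for (g, d), p in joint.items() if p > 0)


# Hypothetical example: two genotype classes with different disease risk.
freqs = [0.5, 0.5]
risk = [0.1, 0.3]

for f in (0.25, 0.5, 1.0, 2.0):
    print(f, mutual_information(case_control_joint(freqs, risk, f)))
```

When f equals prevalence / (1 - prevalence), the reweighted joint coincides with the population joint, which gives a quick sanity check on the reweighting.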

**Figure 6. Distribution of mice phenotype.**
The distribution of mouse weight exhibits clear sex dependence. Male mice (blue) are significantly heavier than female mice (red). The merged distribution of both sexes (magenta) exhibits larger variation.
