Genes (Basel). 2018 Aug 29;9(9):435.
doi: 10.3390/genes9090435.

FDHE-IW: A Fast Approach for Detecting High-Order Epistasis in Genome-Wide Case-Control Studies

Shouheng Tuo. Genes (Basel).

Abstract

Detecting high-order epistasis in genome-wide association studies (GWASs) is important for characterizing complex human diseases. However, the enormous number of possible single-nucleotide polymorphism (SNP) combinations and the diversity among diseases present a significant computational challenge. Herein, a fast method for detecting high-order epistasis based on an interaction weight (FDHE-IW) is evaluated for detecting SNP combinations associated with disease. First, the symmetrical uncertainty (SU) value for each SNP is calculated. Then, the top-k SNPs are selected as guiders to identify 2-way SNP combinations with significant interaction weight values. Next, a forward search is employed to detect high-order SNP combinations with significant interaction weight values as candidates. Finally, the candidates are statistically evaluated using a G-test to isolate true positives. The developed algorithm was applied to 12 simulated datasets and an age-related macular degeneration (AMD) dataset and was shown to perform robustly in detecting some high-order disease-causing models.
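
The first step of the abstract, ranking SNPs by symmetrical uncertainty and keeping the top k, can be illustrated with a minimal sketch. It assumes the standard definition SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), which the abstract does not spell out. The function names, the 0/1/2 genotype coding, and the toy data are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(snp, phenotype):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); assumed standard definition."""
    h_x = entropy(snp)
    h_y = entropy(phenotype)
    h_xy = entropy(list(zip(snp, phenotype)))  # joint entropy H(X, Y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant; no information to share
    mutual_info = h_x + h_y - h_xy
    return 2.0 * mutual_info / (h_x + h_y)


def top_k_snps(genotypes, phenotype, k):
    """Indices of the k SNPs with the largest SU (hypothetical helper)."""
    scores = [symmetrical_uncertainty(col, phenotype) for col in genotypes]
    ranked = sorted(range(len(genotypes)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data (made up): 8 samples, case/control labels, three SNP columns.
phenotype = [0, 0, 0, 0, 1, 1, 1, 1]
snp_a = [0, 0, 0, 0, 1, 1, 1, 1]  # perfectly tracks the phenotype
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]  # independent of the phenotype
snp_c = [0, 0, 0, 1, 1, 1, 1, 2]  # partially informative

guiders = top_k_snps([snp_a, snp_b, snp_c], phenotype, k=2)
print(guiders)
```

In this toy example the perfectly informative SNP scores 1, the independent SNP scores 0, and the partially informative SNP falls in between, so the first and third columns are kept as guiders.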

Keywords: Single-nucleotide polymorphism; high-order epistasis; interaction weight.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
An example of the fast approach for detecting high-order epistasis with interaction weight (FDHE-IW), applied to detecting 3-way single-nucleotide polymorphism (SNP) combinations associated with a given phenotype. (1) Using the dataset with N SNPs, the symmetrical uncertainty (SU) value of each SNP was calculated. (2) The three SNPs with the largest SU values were selected from the N SNPs and employed as seeds to calculate the 2-way interaction weights (IW) with the other SNPs. (3) The top nine 2-way SNP combinations, each pairing one of the three parent SNPs from (2) with another SNP, were selected based on IW values. (4) The top 9 × 3 = 27 3-way SNP combinations, each formed from a parent SNP combination in (3), were selected based on IW. (5) The G-test was applied to the 27 3-way SNP combinations, and two of them were verified.
Figure 2
Detection powers of the five evaluated algorithms (100 SNPs, sample size 1600). DME: Disease loci with marginal effects.
Figure 3
Detection powers of the five evaluated algorithms (1000 SNPs, 4000 sample size).
Figure 4
2-way SNP network and a representative gene network. (a) The network contains 45 nodes and 35 edges. Each node denotes a SNP locus, and each edge represents a 2-way SNP combination that has a strong association with the phenotype. The yellow nodes are SNPs that have been reported to be associated with age-related macular degeneration (AMD). (b) The nodes and edges are mapped from those in (a): each node denotes a gene, and each edge represents a 2-way SNP combination that is mapped to two genes. NA denotes non-gene-coding regions; there are multiple NA–NA edges because multiple SNP–SNP pairs were mapped to non-gene-coding regions. The more edges between two gene nodes, the more SNP combinations map to those two genes. The yellow nodes are genes believed to be associated with AMD. The many edges at node NA indicate that multiple SNP combinations fall in non-coding regions.
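
The final verification step, step (5) in Figure 1, relies on a G-test of candidate SNP combinations against the phenotype. The sketch below computes the standard G statistic, G = 2·Σ O·ln(O/E), over the contingency table of genotype combination versus case/control status. The function name, the toy data, and the choice to base the degrees of freedom on the genotype combinations actually observed are assumptions for illustration, not details confirmed by the source. Converting G to a p-value would additionally need a chi-square distribution function, which is omitted here.

```python
import math
from collections import Counter


def g_statistic(snp_columns, phenotype):
    """G statistic and degrees of freedom for a SNP combination (hypothetical helper).

    snp_columns: list of genotype lists, one per SNP in the combination.
    phenotype:   list of case/control labels, same length as each column.
    """
    combos = list(zip(*snp_columns))         # joint genotype per sample
    n = len(phenotype)
    observed = Counter(zip(combos, phenotype))
    row_totals = Counter(combos)             # counts per genotype combination
    col_totals = Counter(phenotype)          # counts per phenotype class
    g = 0.0
    for (combo, label), obs in observed.items():
        expected = row_totals[combo] * col_totals[label] / n
        g += 2.0 * obs * math.log(obs / expected)  # empty cells contribute 0
    # Assumption: degrees of freedom use only the observed combinations.
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return g, dof


# Toy data (made up): the phenotype is the XOR of two SNPs, so neither SNP
# shows a marginal effect, but the pair is strongly associated.
snp1 = [0, 0, 1, 1, 0, 0, 1, 1]
snp2 = [0, 1, 0, 1, 0, 1, 0, 1]
phenotype = [0, 1, 1, 0, 0, 1, 1, 0]

g_single, dof_single = g_statistic([snp1], phenotype)
g_pair, dof_pair = g_statistic([snp1, snp2], phenotype)
print(g_single, dof_single)
print(g_pair, dof_pair)
```

In this toy case the single SNP yields G = 0, while the pair yields a large G, which is the kind of purely interactive signal that a marginal scan of individual SNPs would miss.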
