An efficient algorithm to perform multiple testing in epistasis screening
- PMID: 23617239
- PMCID: PMC3648350
- DOI: 10.1186/1471-2105-14-138
Abstract
Background: Research in epistasis, or gene-gene interaction, detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved efforts to translate statistical epistasis into biological epistasis, and attempts to integrate different omics information sources into epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Because the number of SNP pairs grows quadratically, gene-gene interaction studies require memory proportional to the square of the number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT that requires an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn's disease.
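To make the memory argument concrete, the following is a minimal Python sketch, not the MBMDR-3.0.3 code. It assumes a single-step maxT (the published implementation may use a different, step-down variant), a hypothetical pair statistic `toy_statistic`, 8-byte stored values, and 1,000,000 SNPs as an illustrative genome-wide scale. The function `full_table_bytes` shows why storing one value per SNP pair reaches terabytes, while `streaming_maxt` keeps only the `top_k` best observed statistics and one maximum per permutation, so its storage does not depend on the number of pairs.

```python
import heapq
import random
from itertools import combinations
from math import comb


def full_table_bytes(n_snps, bytes_per_stat=8):
    """Bytes needed to store one statistic per SNP pair (assumed 8-byte floats)."""
    return comb(n_snps, 2) * bytes_per_stat


def toy_statistic(g1, g2, trait):
    """Hypothetical pair statistic: absolute difference in mean trait between
    individuals carrying a minor allele at both SNPs and everyone else."""
    inside = [y for a, b, y in zip(g1, g2, trait) if a > 0 and b > 0]
    outside = [y for a, b, y in zip(g1, g2, trait) if not (a > 0 and b > 0)]
    if not inside or not outside:
        return 0.0
    return abs(sum(inside) / len(inside) - sum(outside) / len(outside))


def streaming_maxt(genotypes, trait, statistic, n_perm=999, top_k=3, seed=0):
    """Single-step maxT over all SNP pairs, storing top_k + n_perm numbers."""
    rng = random.Random(seed)

    def all_pairs():
        return combinations(range(len(genotypes)), 2)

    # Observed data: stream over every pair, keep only the top_k statistics.
    top = []  # min-heap of (statistic, pair)
    for i, j in all_pairs():
        s = statistic(genotypes[i], genotypes[j], trait)
        if len(top) < top_k:
            heapq.heappush(top, (s, (i, j)))
        elif s > top[0][0]:
            heapq.heapreplace(top, (s, (i, j)))

    # Each permutation: stream over every pair, keep only the maximum.
    perm_max = []
    for _ in range(n_perm):
        shuffled = list(trait)
        rng.shuffle(shuffled)
        perm_max.append(
            max(statistic(genotypes[i], genotypes[j], shuffled) for i, j in all_pairs())
        )

    # Adjusted p-value: share of permutation maxima at least as large.
    results = []
    for s, pair in sorted(top, reverse=True):
        exceed = sum(m >= s for m in perm_max)
        results.append((pair, s, (exceed + 1) / (n_perm + 1)))
    return results


# Simulated toy data with an interaction planted between SNP 0 and SNP 1.
data_rng = random.Random(1)
genotypes = [[data_rng.choice([0, 1, 2]) for _ in range(60)] for _ in range(6)]
trait = [int(a > 0 and b > 0) for a, b in zip(genotypes[0], genotypes[1])]
results = streaming_maxt(genotypes, trait, toy_statistic, n_perm=99, top_k=3)
terabytes = full_table_bytes(1_000_000) / 1e12  # roughly 4 TB per stored vector
```

Each entry of `results` is a `(pair, statistic, adjusted p-value)` tuple, ordered from the strongest pair downwards. The trade-off in this sketch is that only the `top_k` pairs receive an adjusted p-value, in exchange for never holding the full table of statistics in memory.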
Results: For a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions in a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster of 10 blades, each containing four Quad-Core AMD Opteron(tm) 2352 processors running at 2.1 GHz. For a continuous trait, a similar run takes 9 days. On real-life Crohn's disease (CD) data, our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05.
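The scale of this run can be checked with a short back-of-the-envelope calculation. The sketch below assumes that every SNP pair is re-evaluated once on the observed trait and once per permutation, and that all cores of the cluster are busy for the whole run; neither assumption is stated in the abstract, so the resulting rate is only an order-of-magnitude estimate.

```python
from math import comb

n_snps = 100_000
n_perm = 999

pairs = comb(n_snps, 2)             # 4,999,950,000 SNP pairs
evaluations = pairs * (n_perm + 1)  # observed trait plus each permutation

cores = 10 * 4 * 4                  # 10 blades x 4 processors x 4 cores each
wall_seconds = (4 * 24 + 9) * 3600  # 4 days and 9 hours

# Assumed: perfect load balance across all cores for the whole run.
rate_per_core = evaluations / (wall_seconds * cores)

# Smallest p-value that 999 permutations can resolve.
min_p = 1 / (n_perm + 1)

print(f"{pairs:,} pairs, {evaluations:.3e} pair evaluations")
print(f"about {rate_per_core:,.0f} pair evaluations per core per second")
print(f"smallest attainable permutation p-value: {min_p}")
```

Under these assumptions the run corresponds to tens of thousands of pair evaluations per core per second, and 999 permutations cannot resolve a corrected p-value below 0.001.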
Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rate. A new implementation that aims to reach genome-wide epistasis screening is under development. In the context of Crohn's disease, MBMDR-3.0.3 identified epistasis involving regions that are well known in the field and that can be explained from a biological point of view. This demonstrates the power of our software to find relevant higher-order phenotype-genotype associations.
Similar articles
- gammaMAXT: a fast multiple-testing correction algorithm. BioData Min. 2015 Nov 20;8:36. doi: 10.1186/s13040-015-0069-x. eCollection 2015. PMID: 26594243 Free PMC article.
- High-throughput analysis of epistasis in genome-wide association studies with BiForce. Bioinformatics. 2012 Aug 1;28(15):1957-64. doi: 10.1093/bioinformatics/bts304. Epub 2012 May 21. PMID: 22618535 Free PMC article.
- Enabling personal genomics with an explicit test of epistasis. Pac Symp Biocomput. 2010:327-36. doi: 10.1142/9789814295291_0035. PMID: 19908385 Free PMC article.
- Review on GPU accelerated methods for genome-wide SNP-SNP interactions. Mol Genet Genomics. 2024 Dec 29;300(1):10. doi: 10.1007/s00438-024-02214-6. PMID: 39738695 Review.
- Travelling the world of gene-gene interactions. Brief Bioinform. 2012 Jan;13(1):1-19. doi: 10.1093/bib/bbr012. Epub 2011 Mar 26. PMID: 21441561 Review.
Cited by
- Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits. PLoS Genet. 2016 Apr 22;12(4):e1005965. doi: 10.1371/journal.pgen.1005965. eCollection 2016 Apr. PMID: 27104857 Free PMC article.
- Practical aspects of genome-wide association interaction analysis. Hum Genet. 2014 Nov;133(11):1343-58. doi: 10.1007/s00439-014-1480-y. Epub 2014 Aug 28. PMID: 25164382 Review.
- How to increase our belief in discovered statistical interactions via large-scale association studies? Hum Genet. 2019 Apr;138(4):293-305. doi: 10.1007/s00439-019-01987-w. Epub 2019 Mar 6. PMID: 30840129 Free PMC article. Review.
- gammaMAXT: a fast multiple-testing correction algorithm. BioData Min. 2015 Nov 20;8:36. doi: 10.1186/s13040-015-0069-x. eCollection 2015. PMID: 26594243 Free PMC article.
- DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies. NAR Genom Bioinform. 2021 Jul 20;3(3):lqab065. doi: 10.1093/nargab/lqab065. eCollection 2021 Sep. PMID: 34296082 Free PMC article.