Learning genetic epistasis using Bayesian network scoring criteria
- PMID: 21453508
- PMCID: PMC3080825
- DOI: 10.1186/1471-2105-12-89
Abstract
Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model.
Results: We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set.
Conclusions: We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.
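As an illustration of the kind of scoring the abstract describes, the following is a minimal sketch and not the authors' implementation. It assumes the Bayesian scoring criterion takes the BDeu (Bayesian Dirichlet equivalent uniform) form, in which α is the equivalent sample size, and it assumes a model is a set of SNPs that are parents of a binary disease node. The function names, the three-genotype coding (0, 1, 2), and the toy data are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations, product
from math import lgamma


def bdeu_log_score(genotypes, disease, snps, alpha, n_genotypes=3, n_states=2):
    """Log BDeu score of the disease node with the SNP columns `snps` as parents.

    Assumption: genotypes are coded 0..n_genotypes-1 and disease 0..n_states-1.
    """
    q = n_genotypes ** len(snps)       # number of parent (genotype) configurations
    a_j = alpha / q                    # prior mass per parent configuration
    a_jk = alpha / (q * n_states)      # prior mass per (configuration, disease state)

    counts = Counter()                 # N_jk: rows per (configuration, disease state)
    for row, d in zip(genotypes, disease):
        counts[(tuple(row[s] for s in snps), d)] += 1
    totals = Counter()                 # N_j: rows per configuration
    for (config, _), n in counts.items():
        totals[config] += n

    # Unobserved configurations contribute exactly zero, so only observed ones are summed.
    score = 0.0
    for config, n_j in totals.items():
        score += lgamma(a_j) - lgamma(a_j + n_j)
        for k in range(n_states):
            score += lgamma(a_jk + counts[(config, k)]) - lgamma(a_jk)
    return score


def rank_snp_models(genotypes, disease, alpha, max_size=2):
    """Score every SNP subset up to max_size and return them best first."""
    n_snps = len(genotypes[0])
    models = [m for size in range(1, max_size + 1)
              for m in combinations(range(n_snps), size)]
    return sorted(models,
                  key=lambda m: bdeu_log_score(genotypes, disease, m, alpha),
                  reverse=True)


# Toy data (hypothetical): three SNPs on a full genotype grid, repeated four times.
# Disease status depends only on SNP 0 and SNP 1; SNP 2 is unrelated.
genotypes = [list(g) for g in product(range(3), repeat=3)] * 4
disease = [(g[0] % 2) ^ (g[1] % 2) for g in genotypes]
ranked = rank_snp_models(genotypes, disease, alpha=9, max_size=2)
print(ranked[0])
```

Raising `max_size` scores models with more SNPs, which corresponds to the setting where the number of interacting SNPs is not known in advance. Exhaustive enumeration grows combinatorially with the number of SNPs, so this sketch is only practical for small panels.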