Integrating domain knowledge with statistical and data mining methods for high-density genomic SNP disease association analysis
- PMID: 17625973
- DOI: 10.1016/j.jbi.2007.06.002
Abstract
Genome-wide association studies can help identify multi-gene contributions to disease. As the number of high-density genomic markers tested increases, however, so does the number of loci associated with disease by chance. A brute-force test for interactions among four or more high-density genomic loci is infeasible given current computational limitations, so heuristics must be employed to limit the number of statistical tests performed. In this paper we explore the use of biological domain knowledge to supplement statistical analysis and data mining methods to identify genes and pathways associated with disease. We describe Pathway/SNP, a software application designed to help evaluate the association between pathways and disease. Pathway/SNP integrates domain knowledge (SNP, gene, and pathway annotation from multiple sources) with statistical and data mining algorithms into a tool that can be used to explore the etiology of complex diseases.
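To make the scale of the brute-force problem concrete, the following minimal sketch counts how many distinct interaction tests arise at each interaction order, and how restricting the search to markers annotated to a single pathway shrinks that count. The marker totals (500,000 genome-wide SNPs, 200 SNPs in one pathway) and the function names are illustrative assumptions, not figures or code from the paper, and this is not how Pathway/SNP itself is implemented. The Bonferroni threshold is included only as a familiar reference point for the multiple-testing burden.

```python
from math import comb


def interaction_tests(n_snps: int, order: int) -> int:
    """Count the distinct SNP subsets of size `order` among `n_snps` markers."""
    return comb(n_snps, order)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that keeps the family-wise error rate at `alpha`."""
    return alpha / n_tests


# Hypothetical marker counts, chosen only for illustration.
GENOME_WIDE_SNPS = 500_000
PATHWAY_SNPS = 200

for order in (1, 2, 3, 4):
    genome = interaction_tests(GENOME_WIDE_SNPS, order)
    pathway = interaction_tests(PATHWAY_SNPS, order)
    print(
        f"order {order}: genome-wide {genome:.3e} tests "
        f"(Bonferroni {bonferroni_threshold(genome):.1e}), "
        f"pathway-restricted {pathway:.3e} tests"
    )
```

Under these assumed counts, the four-locus search exceeds 10^21 tests genome-wide but stays below 10^8 within one pathway, which is the kind of reduction a domain-knowledge heuristic is meant to deliver.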