Clique-based data mining for related genes in a biomedical database
- PMID: 19566964
- PMCID: PMC2721841
- DOI: 10.1186/1471-2105-10-205
Abstract
Background: Progress in the life sciences depends on integrating biomedical knowledge about numerous genes in order to formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to search biomedical databases automatically and comprehensively for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely connected subgraphs, which are modeled as cliques, in a biomedical relational graph.
Results: We constructed a graph whose nodes were gene or disease pages and whose edges were the hyperlinks between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to obtain a coherent, holistic picture that helps in interpreting relations among genes.
Conclusion: We presented a data mining approach that extracts related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
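The graph construction and clique enumeration described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the enumeration algorithm, so the standard Bron-Kerbosch procedure with pivoting is used here as a stand-in. The page names (`gene_A`, `disease_X`, and so on), the treatment of hyperlinks as undirected edges, and the minimum module size of 3 are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected adjacency map from (page, page) hyperlink pairs.

    Assumption: a hyperlink in either direction counts as one undirected edge.
    """
    adj = defaultdict(set)
    for a, b in links:
        if a != b:  # ignore self-links
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique using Bron-Kerbosch with pivoting."""

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pick the pivot with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical toy data: placeholder page names, not real OMIM entries.
links = [
    ("gene_A", "gene_B"),
    ("gene_A", "gene_C"),
    ("gene_B", "gene_C"),
    ("gene_C", "disease_X"),
    ("gene_C", "gene_D"),
    ("gene_D", "disease_X"),
]

adj = build_graph(links)

# Keep cliques of at least 3 pages as candidate "modules" (assumed threshold).
modules = [c for c in maximal_cliques(adj) if len(c) >= 3]

for module in modules:
    print(sorted(module))
```

On this toy graph the two triangles are reported as separate modules that share `gene_C`, which shows how one gene can belong to more than one module. Maximal clique enumeration can take exponential time in the worst case, so a graph on the scale of a full database would likely need a more careful implementation than this sketch.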