Biclustering in gene expression data by tendency
- PMID: 16448012
- DOI: 10.1109/csb.2004.1332431
Abstract
The advent of DNA microarray technologies has revolutionized the experimental study of gene expression. Clustering is the most popular approach to analyzing gene expression data and has proven successful in many applications. Our work focuses on discovering subsets of genes that exhibit similar expression patterns along a subset of conditions in the gene expression matrix. Specifically, we look for Order-Preserving Clusters (OP-Clusters), in each of which a subset of genes induces a similar linear ordering along a subset of conditions. The pioneering OPSM model [3], which enforces a strict order shared by the genes in a cluster, is included in our model as a special case. Our model is more robust than OPSM because similarly expressed conditions are allowed to form order-equivalent groups, and no restriction is placed on the order within a group. Guided by our model, we design and implement a deterministic algorithm, OPCTree, to discover OP-Clusters. Experiments on two real datasets demonstrate the effectiveness of the algorithm in tissue classification and cell cycle identification. In addition, a large percentage of OP-Clusters exhibit significant enrichment of one or more functional categories, which implies that OP-Clusters carry significant biological relevance.
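The order-equivalence idea in the abstract can be illustrated with a minimal sketch. It is not the OPCTree algorithm, which the abstract does not describe; it only checks whether a given set of genes shares one ordering over a given set of conditions. The function names, the absolute grouping threshold `delta`, and the rule that adjacent ranked conditions closer than `delta` fall into one group are all assumptions made for illustration. Setting `delta` to 0 approximates the strict ordering that OPSM enforces.

```python
from typing import FrozenSet, List, Sequence, Tuple

# A signature is an ordered tuple of condition groups; order inside a group is ignored.
Signature = Tuple[FrozenSet[int], ...]


def order_signature(values: Sequence[float],
                    conditions: Sequence[int],
                    delta: float) -> Signature:
    """Rank the chosen conditions by one gene's expression values.

    Hypothetical grouping rule (an assumption, not taken from the paper):
    a condition joins the previous group when its value is less than
    `delta` above the last value placed in that group.
    """
    ranked = sorted(conditions, key=lambda c: values[c])
    groups: List[List[int]] = []
    for c in ranked:
        if groups and values[c] - values[groups[-1][-1]] < delta:
            groups[-1].append(c)   # similarly expressed: same order-equivalent group
        else:
            groups.append([c])     # clearly higher: start a new group
    return tuple(frozenset(g) for g in groups)


def is_op_cluster(matrix: Sequence[Sequence[float]],
                  genes: Sequence[int],
                  conditions: Sequence[int],
                  delta: float) -> bool:
    """Return True when every selected gene yields the same grouped ordering."""
    signatures = {order_signature(matrix[g], conditions, delta) for g in genes}
    return len(signatures) == 1


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
matrix = [
    [1.0, 5.0, 5.2, 9.0],  # gene 0: c0 < {c1, c2} < c3
    [0.5, 3.1, 3.0, 7.0],  # gene 1: c1 and c2 swap, but they are near-equal
    [9.0, 5.0, 5.2, 1.0],  # gene 2: reversed trend
]
all_conditions = [0, 1, 2, 3]

print(is_op_cluster(matrix, [0, 1], all_conditions, delta=0.5))
print(is_op_cluster(matrix, [0, 1], all_conditions, delta=0.0))
print(is_op_cluster(matrix, [0, 1, 2], all_conditions, delta=0.5))
```

In this toy data, genes 0 and 1 disagree on the relative order of conditions 1 and 2 only by a small margin, so they match once those two conditions are grouped, while a strict comparison separates them. That is the sense in which grouping makes the model more tolerant than a strict order.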
Similar articles
- On mining micro-array data by Order-Preserving Submatrix. Int J Bioinform Res Appl. 2007;3(1):42-64. doi: 10.1504/IJBRA.2007.011834. PMID: 18048172
- Identification of coherent patterns in gene expression data using an efficient biclustering algorithm and parallel coordinate visualization. BMC Bioinformatics. 2008 Apr 23;9:210. doi: 10.1186/1471-2105-9-210. PMID: 18433478. Free PMC article.
- Parallelized evolutionary learning for detection of biclusters in gene expression data. IEEE/ACM Trans Comput Biol Bioinform. 2012;9(2):560-70. doi: 10.1109/TCBB.2011.53. Epub 2011 Mar 3. PMID: 21383419
- Recent patents on biclustering algorithms for gene expression data analysis. Recent Pat DNA Gene Seq. 2011 Aug;5(2):117-25. doi: 10.2174/187221511796392097. PMID: 21529337. Review.
- Unsupervised pattern recognition: an introduction to the whys and wherefores of clustering microarray data. Brief Bioinform. 2005 Dec;6(4):331-43. doi: 10.1093/bib/6.4.331. PMID: 16420732. Review.
Cited by
- BicSPAM: flexible biclustering using sequential patterns. BMC Bioinformatics. 2014 May 6;15:130. doi: 10.1186/1471-2105-15-130. PMID: 24885271. Free PMC article.
- A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series. Algorithms Mol Biol. 2009 Jun 4;4:8. doi: 10.1186/1748-7188-4-8. PMID: 19497096. Free PMC article.
- Efficient Mining of Discriminative Co-clusters from Gene Expression Data. Knowl Inf Syst. 2014 Dec;41(3):667-696. doi: 10.1007/s10115-013-0684-0. PMID: 25642010. Free PMC article.
- UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data. Sci Rep. 2016 Mar 22;6:23466. doi: 10.1038/srep23466. PMID: 27001340. Free PMC article.