Adaptive quality-based clustering of gene expression profiles

Frank De Smet¹, Janick Mathys, Kathleen Marchal, Gert Thijs, Bart De Moor, Yves Moreau

PMID: 12050070
DOI: 10.1093/bioinformatics/18.5.735

Frank De Smet et al. Bioinformatics. 2002 May;18(5):735-46.

Affiliation

¹ ESAT-SCD (SISTA), K.U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium. frank.desmet@esat.kuleuven.ac.be

Abstract

Motivation: Microarray experiments generate a considerable amount of data which, when analyzed properly, can yield a great deal of biologically relevant information about global cellular behaviour. Clustering (grouping genes with similar expression profiles) is one of the first steps in the analysis of high-throughput expression measurements. A number of clustering algorithms have proved useful for making sense of such data. These classical algorithms nevertheless suffer from several drawbacks: for example, they require arbitrary parameters such as the number of clusters to be defined in advance, and they force every gene into a cluster even when its correlation with the other cluster members is low. Here we describe a novel adaptive quality-based clustering algorithm that tackles some of these drawbacks.

Results: We propose a heuristic iterative two-step algorithm. In the first step, we find in the high-dimensional representation of the data a sphere where the "density" of expression profiles is locally maximal, based on a preliminary estimate of the radius of the cluster (the quality-based approach). In the second step, we derive an optimal radius of the cluster (the adaptive approach) so that only the significantly coexpressed genes are included in the cluster. This estimation is achieved by fitting a model to the data using an EM algorithm. Because the radius is inferred from the data itself, the biologist is freed from finding an optimal value for this radius by trial and error. The computational complexity of this method is approximately linear in the number of gene expression profiles in the data set. Finally, our method is successfully validated using existing data sets.
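The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the model that is fitted by EM, so the sketch assumes a two-component Gaussian mixture over the distances to the sphere center (cluster versus background) as a stand-in. The function names `locate_dense_sphere` and `adapt_radius`, the 0.95 posterior threshold, and the toy data are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate_dense_sphere(profiles, start, radius, max_iter=100, tol=1e-9):
    """Step 1 (sketch): move a sphere of fixed preliminary radius to the
    mean of the profiles it contains until the center stops moving."""
    center = list(start)
    for _ in range(max_iter):
        inside = [p for p in profiles if euclidean(p, center) <= radius]
        if not inside:
            break
        new_center = [sum(col) / len(inside) for col in zip(*inside)]
        shift = euclidean(new_center, center)
        center = new_center
        if shift < tol:
            break
    return center


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def adapt_radius(distances, min_posterior=0.95, n_iter=200):
    """Step 2 (sketch): fit a two-component 1-D Gaussian mixture to the
    distances with EM (an assumed stand-in model) and return the largest
    distance whose posterior for the near component is >= min_posterior."""
    lo, hi = min(distances), max(distances)
    mu = [lo, hi]                        # component 0 = cluster, 1 = background
    sigma = [(hi - lo) / 4.0 or 1.0] * 2
    weight = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each distance.
        resp = []
        for d in distances:
            p = [weight[k] * normal_pdf(d, mu[k], sigma[k]) for k in range(2)]
            total = sum(p) or 1e-300     # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weight[k] = nk / len(distances)
            mu[k] = sum(r[k] * d for r, d in zip(resp, distances)) / nk
            var = sum(r[k] * (d - mu[k]) ** 2 for r, d in zip(resp, distances)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
    members = [d for d, r in zip(distances, resp) if r[0] >= min_posterior]
    return max(members) if members else 0.0


# Toy data: seven tightly grouped profiles plus six scattered ones.
cluster = [[1.0 + 0.05 * i, 1.0 - 0.05 * i, 1.0] for i in range(-3, 4)]
background = [[5, 0, 0], [0, 6, 1], [-4, 2, 3], [3, -5, 2], [6, 6, 6], [-3, -3, 4]]
profiles = cluster + background

center = locate_dense_sphere(profiles, start=cluster[0], radius=1.0)
distances = [euclidean(p, center) for p in profiles]
radius = adapt_radius(distances)
print("center:", center)
print("adapted radius:", radius)
print("genes in cluster:", sum(d <= radius for d in distances))
```

In this sketch the preliminary radius only needs to be good enough to place the sphere. The final radius comes from the fitted mixture, which mirrors the idea that the user does not have to tune it by trial and error.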

Availability: http://www.esat.kuleuven.ac.be/~thijs/Work/Clustering.html
