Review
Bioinform Biol Insights. 2016 Nov 30;10:237-253.
doi: 10.4137/BBI.S38316. eCollection 2016.

Clustering Algorithms: Their Application to Gene Expression Data

Jelili Oyelade et al. Bioinform Biol Insights.

Abstract

Gene expression data hold vital information needed to understand the biological processes that take place in an organism in relation to its environment. Deciphering the hidden patterns in these data offers a major opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the large number of genes involved make the resulting mass of data, which consists of millions of measurements, difficult to comprehend and interpret; these data also exhibit vagueness, imprecision, and noise. Clustering techniques are therefore a first step toward addressing these challenges and an essential part of the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. Clustering of gene expression data has proven useful for revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. A further benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the clustering algorithms applicable to gene expression data in order to identify the appropriate clustering technique, one that offers stability and a high degree of accuracy in the analysis.

Keywords: bioinformatics; biological process; clustering algorithm; gene expression data; homology.

Conflict of interest statement

Authors disclose no potential conflicts of interest.

Figures

Figure 1. Classification of clustering techniques.
