Combining multiple clusterings using evidence accumulation
- PMID: 15943417
- DOI: 10.1109/TPAMI.2005.113
Abstract
We explore the idea of evidence accumulation (EAC) for combining the results of multiple clusterings. First, a clustering ensemble, that is, a set of object partitions, is produced. Given a data set (n objects or patterns in d dimensions), data partitions can be produced by 1) applying different clustering algorithms or 2) applying the same clustering algorithm with different parameter values or initializations. Further, combinations of different data representations (feature spaces) and clustering algorithms can also provide a multitude of significantly different data partitionings. We propose a simple framework for extracting a consistent clustering, given the various partitions in a clustering ensemble. According to the EAC concept, each partition is viewed as independent evidence of data organization; the individual partitions are combined through a voting mechanism to generate a new n x n similarity matrix between the n patterns. The final data partition of the n patterns is obtained by applying a hierarchical agglomerative clustering algorithm to this matrix. We have developed a theoretical framework for the analysis of the proposed clustering combination strategy and its evaluation, based on the concept of mutual information between data partitions. Stability of the results is evaluated using bootstrapping techniques. A detailed discussion of an evidence accumulation-based clustering algorithm, using a split and merge strategy based on the K-means clustering algorithm, is presented. Experimental results of the proposed method on several synthetic and real data sets are compared with other combination strategies and with individual clustering results produced by well-known clustering algorithms.
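The voting and combination steps described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation. The function names (kmeans_labels, coassociation, eac), the plain Lloyd-style K-means, the choice of single-link as the hierarchical method, and the fixed number of final clusters are all assumptions made for the example. Each K-means run with a different random initialization contributes one partition; the n x n similarity matrix records the fraction of partitions in which two patterns share a cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd-style K-means from a random initialization (hypothetical helper)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each pattern to its nearest center (squared Euclidean distance).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels


def coassociation(partitions):
    """Voting step: fraction of partitions that put each pair in the same cluster."""
    partitions = np.asarray(partitions)
    n = partitions.shape[1]
    C = np.zeros((n, n))
    for labels in partitions:
        C += labels[:, None] == labels[None, :]
    return C / len(partitions)


def eac(X, n_partitions=50, k=8, n_final=2, seed=0):
    """Build the ensemble, accumulate evidence, then cluster the similarity matrix."""
    rng = np.random.default_rng(seed)
    # Ensemble: the same algorithm run with different random initializations.
    partitions = [kmeans_labels(X, k, rng) for _ in range(n_partitions)]
    C = coassociation(partitions)
    # Turn similarity into dissimilarity for hierarchical clustering.
    D = 1.0 - C
    np.fill_diagonal(D, 0.0)
    # Single-link is an assumption; the abstract only says "hierarchical agglomerative".
    Z = linkage(squareform(D, checks=False), method="single")
    return fcluster(Z, t=n_final, criterion="maxclust"), C
```

Calling eac(X) on an (n, d) array returns one label per pattern together with the co-association matrix. Using a value of k larger than the expected number of clusters splits the data into small pieces that the hierarchical step then merges, which is one way to read the split and merge strategy mentioned in the abstract.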