On the Accuracy and Parallelism of GPGPU-Powered Incremental Clustering Algorithms
- PMID: 29123546
- PMCID: PMC5662818
- DOI: 10.1155/2017/2519782
Abstract
Incremental clustering algorithms play a vital role in applications such as massive data analysis and real-time data processing. Typical application scenarios of incremental clustering place high demands on the computing power of the hardware platform. Parallel computing is a common way to meet this demand, and the General-Purpose Graphics Processing Unit (GPGPU) is a promising parallel computing device. Nevertheless, incremental clustering algorithms face a dilemma between clustering accuracy and parallelism when they are powered by GPGPU. We formally analyzed the cause of this dilemma. First, we formalized concepts relevant to incremental clustering, such as evolving granularity. Second, we formally proved two theorems. The first theorem establishes the relation between clustering accuracy and evolving granularity, and it also gives upper and lower bounds on different-to-same mis-affiliation; fewer occurrences of such mis-affiliation mean higher accuracy. The second theorem reveals the relation between parallelism and evolving granularity; smaller work-depth means superior parallelism. Through the proofs, we conclude that the accuracy of an incremental clustering algorithm is negatively related to evolving granularity, while its parallelism is positively related to the granularity. These contradictory relations cause the dilemma. Finally, we validated the relations through a demo algorithm. Experimental results verified the theoretical conclusions.
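The trade-off described in the abstract can be illustrated with a toy sketch. This is not the demo algorithm from the paper. It assumes, purely for illustration, that evolving granularity is the number of points processed per update step, that every point in a batch is compared against the same snapshot of existing clusters, and that each batch may open at most one new cluster. The names `incremental_cluster`, `different_to_same`, `threshold`, and `granularity` are hypothetical.

```python
def incremental_cluster(points, threshold, granularity):
    """Toy batch-mode leader clustering on 1-D points (illustrative assumption).

    Returns (labels, steps), where steps counts sequential batch updates
    and stands in for work-depth if each batch is processed in parallel.
    """
    centers = []
    labels = []
    steps = 0
    for start in range(0, len(points), granularity):
        batch = points[start:start + granularity]
        snapshot = list(centers)  # every point in the batch sees the same clusters
        new_cluster = None        # assumption: at most one new cluster per batch
        steps += 1
        for p in batch:
            # Find the nearest snapshot center within the threshold, if any.
            best, best_dist = None, threshold
            for idx, c in enumerate(snapshot):
                d = abs(p - c)
                if d <= best_dist:
                    best, best_dist = idx, d
            if best is None:
                if new_cluster is None:
                    centers.append(p)
                    new_cluster = len(centers) - 1
                best = new_cluster
            labels.append(best)
    return labels, steps


def different_to_same(truth, predicted):
    """Count pairs that truly differ in cluster but were given the same label."""
    n = len(truth)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if truth[i] != truth[j] and predicted[i] == predicted[j]
    )


points = [0.0, 0.1, 10.0, 10.1]
truth = [0, 0, 1, 1]
for g in (1, 2, 4):
    labels, steps = incremental_cluster(points, threshold=1.0, granularity=g)
    print(g, steps, different_to_same(truth, labels))
```

Under these assumptions, a larger granularity needs fewer sequential steps but, once a batch spans both true clusters, merges points that belong apart. The sketch only mirrors the direction of the two relations stated in the abstract; it says nothing about the bounds proved in the paper.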