Federated k-means based on clusters backbone
- PMID: 40504823
- PMCID: PMC12161523
- DOI: 10.1371/journal.pone.0326145
Abstract
Federated clustering is a widely used distributed clustering approach that does not require the transmission of raw data. However, it struggles with Non-Independent and Identically Distributed (Non-IID) data, because accurate global consistency measures are difficult to obtain under Non-IID conditions. To address this issue, we propose FKmeansCB, a federated k-means clustering algorithm based on a cluster backbone. First, each client adds Laplace noise to all of its local data and runs k-means clustering to obtain cluster centers, which faithfully represent the cluster backbone (i.e., the data structures of the clusters). The cluster backbone represents the client's features and can approximately capture the features of differently labeled data points in Non-IID situations. The clients then upload these cluster centers to the server. Next, the server aggregates all cluster centers and runs k-means clustering on them to obtain global cluster centers, which are sent back to the clients. Finally, each client assigns all of its data points to the nearest global cluster center to produce the final clustering results. We validated the performance of the proposed algorithm on six datasets, including the large-scale MNIST dataset. Compared with leading non-federated and federated clustering algorithms, FKmeansCB offers significant advantages in both clustering accuracy and running time.
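The four steps described in the abstract (local noise and k-means, upload, server-side k-means, nearest-center assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the farthest-first initialization, the `noise_scale` value, the number of local centers, and the choice to assign the un-noised local points are all assumptions made for illustration, and the noise scale is not calibrated to any privacy budget.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))

def kmeans(points, k, iters=50):
    """Lloyd's k-means with farthest-first initialization (an assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def laplace(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def client_backbone(data, k_local, noise_scale, rng):
    """Client step: perturb local data, then cluster it locally."""
    noisy = [[x + laplace(noise_scale, rng) for x in p] for p in data]
    return kmeans(noisy, k_local)

def server_aggregate(all_local_centers, k_global):
    """Server step: pool every uploaded center and cluster the pool."""
    pooled = [c for centers in all_local_centers for c in centers]
    return kmeans(pooled, k_global)

def client_assign(data, global_centers):
    """Final client step: label each local point by its nearest global center."""
    return [nearest(p, global_centers) for p in data]

# Toy Non-IID setup: each client holds points from only one cluster.
rng = random.Random(0)
client_a = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(40)]
client_b = [[rng.gauss(10, 0.5), rng.gauss(10, 0.5)] for _ in range(40)]

uploads = [client_backbone(d, k_local=3, noise_scale=0.1, rng=rng)
           for d in (client_a, client_b)]
global_centers = server_aggregate(uploads, k_global=2)
labels_a = client_assign(client_a, global_centers)
labels_b = client_assign(client_b, global_centers)
```

In this toy setup only the local cluster centers leave each client, and the server clusters six uploaded centers rather than eighty raw points. Using more local centers than global clusters is one way to let a client's upload describe the shape of its data, but how the paper chooses these numbers is not stated in the abstract.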
Copyright: © 2025 Deng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of interest statement
No authors have competing interests.
