Abstract
This paper evaluates the divide-and-federate concept for finding clusters that differ in density and shape and are contaminated with noise. The proposed divide-and-federate clustering method is based on density and distance evaluation of the data. In the first phase, the algorithm divides the data into sub-clusters based on a density evaluation with respect to all data dimensions. In the second phase, small sub-clusters are federated with large sub-clusters to form the actual data clusters. The federation phase is based on a distance evaluation of the sub-clusters, which are merged according to the close proximity of their neighbors. The proposed algorithm handles noisy data by integrating an outlier detection preprocessing method. Its usefulness is demonstrated on examples of complex synthetic benchmark functions.
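The two-phase idea can be illustrated with a small toy sketch. This is not the authors' algorithm: the abstract does not give the density or distance rules, so the sketch assumes a fixed linking radius `eps` for the divide phase and a size threshold `min_size` for the federate phase, and it omits the outlier detection preprocessing step. The function names `divide` and `federate` and both parameters are hypothetical.

```python
import numpy as np


def divide(X, eps):
    """Divide phase (assumed rule): link points closer than eps and
    treat each connected group of points as one sub-cluster."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    labels = np.full(n, -1)
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue  # already assigned to a sub-cluster
        labels[i] = current
        stack = [i]
        while stack:
            j = stack.pop()
            # unassigned points within eps of point j join the same sub-cluster
            for k in np.flatnonzero((dist[j] < eps) & (labels == -1)):
                labels[k] = current
                stack.append(k)
        current += 1
    return labels


def federate(X, labels, min_size):
    """Federate phase (assumed rule): attach every sub-cluster with fewer
    than min_size points to the large sub-cluster holding its nearest point."""
    labels = labels.copy()
    ids, counts = np.unique(labels, return_counts=True)
    large = ids[counts >= min_size]
    small = ids[counts < min_size]
    if len(large) == 0:
        return labels  # nothing to federate with
    large_mask = np.isin(labels, large)
    X_large, labels_large = X[large_mask], labels[large_mask]
    for s in small:
        pts = X[labels == s]
        d = np.linalg.norm(pts[:, None, :] - X_large[None, :, :], axis=2)
        # label of the large-cluster point closest to any point of sub-cluster s
        labels[labels == s] = labels_large[np.argmin(d.min(axis=0))]
    return labels


# Toy data: two dense groups plus a two-point satellite near the first group.
X = np.array([
    [0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.5, 0.5], [1.0, 0.5],       # group A
    [2.5, 0.5], [2.5, 1.0],                                           # satellite
    [10.0, 10.0], [10.5, 10.0], [10.0, 10.5], [10.5, 10.5], [11.0, 10.5],  # group B
])
sub = divide(X, eps=0.8)
final = federate(X, sub, min_size=3)
print(len(np.unique(sub)), "sub-clusters ->", len(np.unique(final)), "clusters")
```

On this toy data the divide phase yields three sub-clusters, and the federate phase merges the two-point satellite into the nearby large group, leaving two clusters. The fixed `eps` is the weakest part of the sketch: a single radius cannot adapt to clusters of different densities, which is the case the paper targets.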
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rehman, A.U., Belhaouari, S.B., Stanko, T., Gorovoy, V. (2023). Divide to Federate Clustering Concept for Unsupervised Learning. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Seventh International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 465. Springer, Singapore. https://doi.org/10.1007/978-981-19-2397-5_3
DOI: https://doi.org/10.1007/978-981-19-2397-5_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2396-8
Online ISBN: 978-981-19-2397-5
eBook Packages: Engineering, Engineering (R0)