Swarm-Based Distributed Clustering in Peer-to-Peer Systems
Clustering can be defined as the process of partitioning a set of patterns into disjoint, homogeneous, and meaningful groups called clusters. Traditional clustering methods require that all data be located at the site where they are analyzed, so they cannot be applied to multiple distributed datasets. This paper describes a multi-agent algorithm for clustering distributed data in a peer-to-peer environment. The proposed algorithm is based on the biologically inspired paradigm of a flock of birds. In this context, agents are used to discover clusters using a density-based approach. Swarm-based algorithms have attractive features, including adaptation, robustness, and a distributed, decentralized nature, which make them well suited to clustering in P2P networks, where centralized network control is difficult to implement. We have applied this algorithm to synthetic and real-world datasets and measured the impact of the flocking search strategy on performance in terms of accuracy and scalability.
Keywords: Swarm Intelligence, Cellular Space, Core Point, Region Query, Noise Point
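The abstract does not spell out the density-based procedure, but the keywords "Core Point", "Region Query", and "Noise Point" suggest DBSCAN-style definitions. The following is a minimal sketch under that assumption. It illustrates only the local density test and does not reproduce the flocking agents or the peer-to-peer distribution described in the paper. The names `region_query`, `classify_points`, `eps`, and `min_pts`, the "border" label, and the sample data are hypothetical and chosen for illustration.

```python
import math

def region_query(points, center, eps):
    """Return the indices of all points within distance eps of center."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= eps]

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (assumed DBSCAN-style rules).

    core:   the eps-neighborhood (the point itself included) holds >= min_pts points
    border: not core, but lies within eps of at least one core point
    noise:  neither core nor border
    """
    neighborhoods = [region_query(points, p, eps) for p in points]
    core = {i for i, hood in enumerate(neighborhoods) if len(hood) >= min_pts}
    labels = []
    for i, hood in enumerate(neighborhoods):
        if i in core:
            labels.append("core")
        elif any(j in core for j in hood):
            labels.append("border")
        else:
            labels.append("noise")
    return labels

# Illustrative data: a dense square, one point on its edge, one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.25, 0.0), (5.0, 5.0)]
labels = classify_points(points, eps=0.2, min_pts=4)
print(labels)
```

In a distributed setting, each peer could in principle run such a test on its local data only. How the agents move between regions and combine local results is specific to the paper and is not modeled here.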