Swarm-Based Distributed Clustering in Peer-to-Peer Systems

  • Gianluigi Folino
  • Agostino Forestiero
  • Giandomenico Spezzano
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3871)

Abstract

Clustering can be defined as the process of partitioning a set of patterns into disjoint, homogeneous, and meaningful groups called clusters. Traditional clustering methods require all data to be located at the site where they are analyzed, so they cannot be applied to multiple distributed datasets. This paper describes a multi-agent algorithm for clustering distributed data in a peer-to-peer environment. The proposed algorithm is based on the biology-inspired paradigm of a flock of birds. Agents, in this context, are used to discover clusters using a density-based approach. Swarm-based algorithms have attractive features, including adaptation, robustness, and a distributed, decentralized nature, that make them well suited to clustering in P2P networks, where centralized network control is difficult to implement. We have applied this algorithm to synthetic and real-world datasets and measured the impact of the flocking search strategy on performance in terms of accuracy and scalability.
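The density-based test mentioned in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes DBSCAN-style definitions, in which a point is a core point when at least `min_pts` points (itself included) lie within radius `eps` of it. The names `region_query`, `classify_point`, `eps`, and `min_pts`, as well as the sample data, are hypothetical and chosen only for illustration; the flocking movement of agents is not modeled here.

```python
import math

def region_query(points, center, eps):
    """Return the indices of all points within distance eps of center."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= eps]

def classify_point(points, idx, eps, min_pts):
    """Label a point 'core' if its eps-neighborhood holds at least min_pts points.

    Assumption: the neighborhood count includes the point itself, as in DBSCAN.
    """
    neighbors = region_query(points, points[idx], eps)
    return "core" if len(neighbors) >= min_pts else "non-core"

# Hypothetical data: four tightly packed points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
labels = [classify_point(points, i, eps=0.5, min_pts=3) for i in range(len(points))]
print(labels)
```

In a distributed setting, each peer could run such a test only on its local data, which is one reason a density-based criterion fits a decentralized search; how the paper combines the results across peers is not stated in the abstract.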

Keywords

Swarm Intelligence, Cellular Space, Core Point, Region Query, Noise Point
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Gianluigi Folino (1)
  • Agostino Forestiero (1)
  • Giandomenico Spezzano (1)
  1. Institute for High Performance Computing and Networking (ICAR), Rende (CS), Italy
