Extended Star Clustering Algorithm

  • Reynaldo J. Gil-García
  • José M. Badía-Contelles
  • Aurora Pons-Porrata
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2905)

Abstract

In this paper we propose the extended star clustering algorithm and compare it with the original star clustering algorithm. We introduce a new concept of star and, as a consequence, obtain different star-shaped clusters. Evaluation experiments on TREC data show that the proposed algorithm outperforms the original algorithm. Our algorithm is independent of the data order and obtains a smaller number of clusters.
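
For orientation, the original star clustering algorithm that the abstract uses as its baseline (Aslam, Pelekhov and Rus) can be sketched roughly as a greedy cover of a thresholded similarity graph. The sketch below is illustrative only: the function names, the Jaccard similarity, the toy documents and the threshold value are all assumptions, and the extended algorithm proposed in the paper is not defined in this excerpt, so it is not reproduced here.

```python
from itertools import combinations


def jaccard(a, b):
    """Toy similarity between two sets of terms (an assumption for this sketch)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def star_clusters(items, similarity, sigma):
    """Greedy star cover in the style of the original star algorithm.

    Returns a list of (center_index, satellite_indices) pairs.
    """
    # Thresholded similarity graph: an edge joins items with similarity >= sigma.
    neighbours = {i: set() for i in range(len(items))}
    for a, b in combinations(range(len(items)), 2):
        if similarity(items[a], items[b]) >= sigma:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Visit vertices by decreasing degree. sorted() is stable, so ties are
    # resolved by input position.
    order = sorted(neighbours, key=lambda v: -len(neighbours[v]))

    covered = set()
    clusters = []
    for v in order:
        if v in covered:
            continue  # already a center or a satellite of an earlier star
        clusters.append((v, sorted(neighbours[v])))
        covered.add(v)
        covered |= neighbours[v]
    return clusters


# Hypothetical toy data: each document is a set of terms.
docs = [{"a", "b"}, {"a", "b", "c"}, {"c", "d"}, {"x", "y"}]
print(star_clusters(docs, jaccard, 0.2))  # [(1, [0, 2]), (3, [])]
```

In this sketch, ties in degree are broken by input position, which is one way a greedy procedure of this kind can depend on the order of the data. A satellite may also be adjacent to more than one center, so clusters can overlap.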

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Reynaldo J. Gil-García (1)
  • José M. Badía-Contelles (2)
  • Aurora Pons-Porrata (1)

  1. Universidad de Oriente, Santiago de Cuba, Cuba
  2. Universitat Jaume I, Castellón, Spain
