Text Data Clustering by Contextual Graphs

  • Krzysztof Ciesielski
  • Mieczysław A. Kłopotek
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4265)

Abstract

In this paper, we focus on the class of graph-based clustering models, such as growing neural gas or idiotypic nets, for the purpose of high-dimensional text data clustering. We present a novel approach that does not require operating on the complex overall graph of clusters, but instead shifts the majority of the effort to context-sensitive processing of local subgraphs and local subspaces. Savings of orders of magnitude in processing time and memory can be achieved while the quality of the clusters improves, as the presented experiments demonstrate.

Keywords

Minimal Span Tree; Contextual Model; Document Cluster; Contextual Group; Contextual Approach
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Krzysztof Ciesielski (1)
  • Mieczysław A. Kłopotek (1)
  1. Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland
