An Immune Network for Contextual Text Data Clustering

  • Krzysztof Ciesielski
  • Sławomir T. Wierzchoń
  • Mieczysław A. Kłopotek
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4163)

Abstract

We present a novel approach to the incremental creation of document maps. It relies on partitioning a given collection of documents into a hierarchy of homogeneous groups, each represented by a different set of terms. Each group (which in fact defines a separate context) is then explored by a modified version of the aiNet immune algorithm to extract its inner structure. The immune cells produced by the algorithm become the reference vectors used to prepare the final document map. This approach proves robust in terms of time and space requirements as well as the quality of the resulting clustering model.
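
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general aiNet-style idea it refers to: cells are cloned, mutated toward the documents they match best, and near-duplicate cells are suppressed, leaving a small set of reference vectors for one context. It is not the authors' modified algorithm. All function names, parameters, and threshold values here are illustrative assumptions, and documents are assumed to be non-zero, non-negative term-weight vectors.

```python
import math
import random


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def suppress(cells, threshold):
    """Keep a cell only if it is farther than `threshold` from every kept cell."""
    kept = []
    for cell in cells:
        if all(cosine_distance(cell, k) > threshold for k in kept):
            kept.append(cell)
    return kept


def immune_network(docs, n_iter=10, n_clones=3, rate=0.5, threshold=0.05, seed=0):
    """Simplified aiNet-style loop for one context (hypothetical parameters)."""
    rng = random.Random(seed)
    # Start from a few documents chosen at random as the initial cells.
    cells = [list(d) for d in rng.sample(docs, min(2, len(docs)))]
    for _ in range(n_iter):
        memory = []
        for doc in docs:
            # Select the cell with the highest affinity (smallest distance).
            best = min(cells, key=lambda c: cosine_distance(c, doc))
            # Clone it and mutate each clone a random step toward the document.
            clones = []
            for _ in range(n_clones):
                step = rate * rng.random()
                clones.append([b + step * (d - b) for b, d in zip(best, doc)])
            # Retain only the best-matching clone as a memory cell.
            memory.append(min(clones, key=lambda c: cosine_distance(c, doc)))
        # Suppression removes cells that are nearly identical to one another.
        cells = suppress(memory, threshold)
    return cells


def quantization_error(docs, cells):
    """Average distance from each document to its closest reference vector."""
    return sum(min(cosine_distance(d, c) for c in cells) for d in docs) / len(docs)
```

In a contextual setting, one would call `immune_network` separately on the documents of each group and then pool the resulting cells as reference vectors for the map; `quantization_error` gives one simple way to compare the resulting models.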

Keywords

Quantization Error; Contextual Model; Reference Vector; Immune Network; Immune Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Krzysztof Ciesielski (1)
  • Sławomir T. Wierzchoń (1)
  • Mieczysław A. Kłopotek (1)
  1. Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland
