Text Categorization and Semantic Browsing with Self-Organizing Maps on Non-euclidean Spaces

  • Jorg Ontrup
  • Helge Ritter
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2168)

Abstract

This paper introduces a new type of Self-Organizing Map (SOM) for text categorization and semantic browsing. We propose a “hyperbolic SOM” (HSOM) based on a regular tessellation of the hyperbolic plane, a non-Euclidean space characterized by constant negative Gaussian curvature. The approach is motivated by the observation that in hyperbolic geometry the size of a neighborhood around a point increases exponentially with its radius, which provides more freedom to map a complex information space such as language into spatial relations. These theoretical findings are supported by our experiments, which show that hyperbolic SOMs can be applied successfully to text categorization and yield results comparable to other state-of-the-art methods. Furthermore, we demonstrate that the HSOM is able to map large text collections in a semantically meaningful way and therefore allows a “semantic browsing” of text databases.

Keywords

Hyperbolic Space, Text Categorization, Hyperbolic Plane, Text Collection, Prototype Vector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Jorg Ontrup (1)
  • Helge Ritter (1)
  1. Neuroinformatics Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
