On the Use of Self-Organizing Maps for Clustering and Visualization

  • Arthur Flexer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1704)

Abstract

We show that the number of output units used in a self-organizing map (SOM) influences its applicability for either clustering or visualization. By reviewing the relevant literature and theory, together with our own empirical results, we demonstrate that SOMs can be used for clustering or visualization separately, for simultaneous clustering and visualization, and even for clustering via visualization. For each of these kinds of application, SOM is compared to other statistical approaches. The comparison shows SOM to be a flexible tool that can be used for various forms of exploratory data analysis, but it also makes clear that this flexibility comes at the price of impaired performance. The use of SOM in the data mining community is covered by discussing its application in the data mining tools CLEMENTINE and WEBSOM.

Keywords

Input Vector, Cluster Center, Output Unit, Output Space, Planar Grid
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Arthur Flexer (1)
  1. The Austrian Research Institute for Artificial Intelligence, Vienna, Austria
