Self-organizing Maps as Substitutes for K-Means Clustering

  • Fernando Bação
  • Victor Lobo
  • Marco Painho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3516)

Abstract

One of the most widely used clustering techniques in GISc problems is the k-means algorithm. One of the most important issues in the correct use of k-means is the initialization procedure, which ultimately determines which part of the solution space will be searched. In this paper we briefly review different initialization procedures and propose Kohonen’s Self-Organizing Maps as the most convenient method, given the proper training parameters. Furthermore, we show that in the final stages of its training procedure the Self-Organizing Map algorithm is rigorously the same as the k-means algorithm. Thus we propose the use of Self-Organizing Maps as possible substitutes for the more classical k-means clustering algorithms.
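The equivalence claimed for the final training stages can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes a 1-D map, a rectangular ("bubble") neighbourhood, a fixed learning rate `alpha`, and the sequential (online) form of k-means rather than the batch form. The function names `nearest`, `som_step`, and `kmeans_online_step` are hypothetical. Under these assumptions, once the neighbourhood radius has shrunk to zero only the winning unit moves, which is the online k-means update.

```python
def nearest(units, x):
    """Index of the unit closest to x (squared Euclidean distance)."""
    return min(range(len(units)),
               key=lambda i: sum((u - v) ** 2 for u, v in zip(units[i], x)))

def som_step(units, x, alpha, radius):
    """One online SOM update on a 1-D map; list index = map position."""
    bmu = nearest(units, x)  # best-matching unit
    updated = []
    for i, w in enumerate(units):
        # Bubble neighbourhood: 1 within `radius` of the winner, else 0.
        h = 1.0 if abs(i - bmu) <= radius else 0.0
        updated.append([wj + alpha * h * (xj - wj) for wj, xj in zip(w, x)])
    return updated

def kmeans_online_step(centroids, x, alpha):
    """One sequential k-means update: only the nearest centroid moves."""
    win = nearest(centroids, x)
    return [
        [cj + alpha * (xj - cj) for cj, xj in zip(c, x)] if i == win else list(c)
        for i, c in enumerate(centroids)
    ]

units = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]  # illustrative values only
x = [0.9, 1.2]

# Radius 0: the SOM step and the online k-means step coincide.
same = som_step(units, x, alpha=0.5, radius=0) == kmeans_online_step(units, x, alpha=0.5)
# Radius 1: neighbours of the winner also move, so the two steps differ.
differs = som_step(units, x, alpha=0.5, radius=1) != kmeans_online_step(units, x, alpha=0.5)
```

With a larger radius, as in the early stages of training, the winner's neighbours on the map are also pulled toward the input, which is the stage at which the map could act as an initialization procedure before the k-means-like final stage.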

Keywords

Geographical Information System; Cluster Center; Quadratic Error; Enumeration District; Initial Centroid
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Fernando Bação (1)
  • Victor Lobo (1, 2)
  • Marco Painho (1)

  1. ISEGI/UNL, Lisboa, Portugal
  2. Portuguese Naval Academy, Almada, Portugal
