A Fast Approximated k-Median Algorithm

  • Eva Gómez-Ballester
  • Luisa Micó
  • Jose Oncina
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2396)

Abstract

The k-means algorithm is a well-known clustering method. Although this technique was initially defined for a vector representation of the data, the set median (the point belonging to a set P that minimizes the sum of distances to the rest of the points in P) can be used instead of the mean when this vectorial representation is not possible. The computational cost of the set median is O(|P|^2). Recently, a new method to obtain an approximated median in O(|P|) was proposed. In this paper we use this approximated median in the k-median algorithm to speed it up.
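As an illustration of the definitions in the abstract, the following is a minimal sketch of the exact set median and of a k-median loop built on it. It is not the authors' implementation: the function names (`set_median`, `assign`, `k_median`), the random initialization, and the stopping rule are assumptions made for this example, and the approximated O(|P|) median is not reproduced here because the abstract does not describe it. The `median` parameter only marks where such a faster routine could be plugged in.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def set_median(points: Sequence[T], dist: Distance) -> T:
    """Exact set median: the member of `points` whose summed distance to
    all members is smallest. Uses O(|P|^2) distance evaluations."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def assign(points: Sequence[T], centers: Sequence[T], dist: Distance) -> List[List[T]]:
    """Put every point in the cluster of its nearest center."""
    clusters: List[List[T]] = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def k_median(
    points: Sequence[T],
    k: int,
    dist: Distance,
    median: Callable[[Sequence[T], Distance], T] = set_median,
    max_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[T], List[List[T]]]:
    """k-means-style iteration with the mean replaced by a set median.

    `median` is a hypothetical hook: passing an approximate median routine
    instead of `set_median` would lower the per-iteration cost.
    """
    rng = random.Random(seed)
    centers = rng.sample(list(points), k)  # assumed initialization: k random points
    for _ in range(max_iter):
        clusters = assign(points, centers, dist)
        # An empty cluster keeps its previous center.
        new_centers = [
            median(c, dist) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # assumed stopping rule: centers unchanged
            break
        centers = new_centers
    return centers, assign(points, centers, dist)


if __name__ == "__main__":
    data = [0, 1, 2, 10, 11, 12]
    centers, clusters = k_median(data, 2, lambda a, b: abs(a - b))
    print(sorted(centers), clusters)
```

Only a pairwise dissimilarity `dist` is required, so the same sketch applies to data without a vector representation, for example strings compared with an edit distance.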

Keywords

Cost function, Synthetic data, Vector representation, Edit distance, Handwritten digit
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Bradley, P. S., Fayyad, U. M.: Refining Initial Points for K-Means Clustering. Proc. 15th International Conf. on Machine Learning (1998).
  2. Duda, R., Hart, P., Stork, D.: Pattern Classification. Wiley (2001).
  3. Fu, K. S.: Syntactic Pattern Recognition and Applications. Prentice-Hall, Englewood Cliffs, NJ (1982).
  4. de la Higuera, C., Casacuberta, F.: The topology of strings: two NP-complete problems. Theoretical Computer Science 230, 39–48 (2000).
  5. Jain, A. K., Dubes, R. C.: Algorithms for clustering data. Prentice-Hall (1988).
  6. Martínez, C., Juan, A., Casacuberta, F.: Improving classification using median string and nn rules. In: Proceedings of IX Simposium Nacional de Reconocimiento de Formas y Análisis de Imágenes, 391–394 (2001).
  7. Micó, L., Oncina, J.: An approximate median search algorithm in non-metric spaces. Pattern Recognition Letters 22, 1145–1151 (2001).
  8. Peña, J. M., Lozano, J. A., Larrañaga, P.: An empirical comparison of four initialization methods for the K-means algorithm. Pattern Recognition Letters 20, 1027–1040 (1999).
  9. Theodoridis, S., Koutroumbas, K.: Pattern Recognition. Academic Press (1999).

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Eva Gómez-Ballester (1)
  • Luisa Micó (1)
  • Jose Oncina (1)
  1. Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Spain
