On Careful Selection of Initial Centers for K-means Algorithm

Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 43)

Abstract

The K-means clustering algorithm is widely studied in the literature, and its popularity stems from its simplicity and computational efficiency. Its key limitation is that convergence depends on the initial partition: an improper choice of initial centroids may lead to poor results. This paper proposes a method, Deterministic Initialization using Constrained Recursive Bi-partitioning (DICRB), for the careful selection of initial centers. First, a set of probable centers is identified using recursive binary partitioning. Then, the initial centers for the K-means algorithm are determined by applying graph clustering to the probable centers. Experimental results demonstrate the efficacy and deterministic nature of the proposed method.

Keywords

Clustering · K-means algorithm · Initialization · Bi-partitioning
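The two-stage initialization outlined in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' DICRB: it recursively bi-partitions the data along the highest-variance axis at the median to collect probable centers, then applies a simple graph clustering (cutting the k−1 longest edges of a minimum spanning tree over the probable centers) to merge them into k deterministic seeds. All function names and the particular splitting and merging criteria are illustrative choices, not taken from the paper.

```python
import numpy as np

def bisect(points, depth, max_depth, leaves):
    """Recursively bi-partition along the highest-variance axis at the median.

    Each leaf contributes its centroid as a probable center (illustrative
    splitting rule, not the paper's constrained bi-partitioning).
    """
    if depth == max_depth or len(points) < 2:
        leaves.append(points.mean(axis=0))
        return
    axis = int(np.argmax(points.var(axis=0)))
    median = np.median(points[:, axis])
    left = points[points[:, axis] <= median]
    right = points[points[:, axis] > median]
    if len(left) == 0 or len(right) == 0:  # degenerate split: stop here
        leaves.append(points.mean(axis=0))
        return
    bisect(left, depth + 1, max_depth, leaves)
    bisect(right, depth + 1, max_depth, leaves)

def mst_edges(centers):
    """Prim's algorithm; returns MST edges as (distance, i, j) tuples."""
    n = len(centers)
    visited = np.zeros(n, dtype=bool)
    visited[0] = True
    best = np.linalg.norm(centers - centers[0], axis=1)
    parent = np.zeros(n, dtype=int)
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(visited, np.inf, best)))
        edges.append((float(best[j]), int(parent[j]), j))
        visited[j] = True
        d = np.linalg.norm(centers - centers[j], axis=1)
        upd = (~visited) & (d < best)
        best[upd] = d[upd]
        parent[upd] = j
    return edges

def deterministic_init(X, k, max_depth=4):
    """Deterministic K-means seeding: bi-partition, then merge via MST cuts."""
    leaves = []
    bisect(np.asarray(X, dtype=float), 0, max_depth, leaves)
    centers = np.array(leaves)
    # Keep all MST edges except the k-1 longest; union-find the components.
    edges = sorted(mst_edges(centers), reverse=True)
    parent = list(range(len(centers)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for _, a, b in edges[k - 1:]:
        parent[find(a)] = find(b)
    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    # Each component's mean becomes one initial center for K-means.
    return np.array([centers[idx].mean(axis=0) for idx in groups.values()])
```

Because every step (variance-based splits, medians, MST cuts) is deterministic, repeated runs on the same data yield the same seeds, which is the property the paper's title emphasizes.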


Copyright information

© Springer India 2016

Authors and Affiliations

  • R. Jothi
  • Sraban Kumar Mohanty
  • Aparajita Ojha

  Indian Institute of Information Technology, Design and Manufacturing Jabalpur, Jabalpur, India