Abstract
The K-means clustering algorithm is widely studied in the literature, and its success stems from its simplicity and computational efficiency. Its key limitation is that the solution it converges to depends on the initial partition: a poor choice of initial centroids may lead to poor clustering results. This paper proposes a method, Deterministic Initialization using Constrained Recursive Bi-partitioning (DICRB), for the careful selection of initial centers. First, a set of probable centers is identified using recursive binary partitioning. Then, the initial centers for the K-means algorithm are determined by applying graph clustering to the probable centers. Experimental results demonstrate the efficacy and deterministic nature of the proposed method.
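The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DICRB procedure: the abstract does not state the split rule, the partitioning constraint, or the graph clustering method, so every such choice below is an assumption made for illustration. The sketch assumes that a partition is bisected at the mean of its highest-variance coordinate, that recursion stops once a partition holds at most `max_size` points (a hypothetical parameter), and that the probable centers are grouped by single-linkage merging, which is equivalent to removing the K-1 longest edges of their minimum spanning tree. The function names are likewise hypothetical.

```python
from math import dist
from statistics import mean, pvariance


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(mean(col) for col in zip(*points))


def probable_centers(points, max_size):
    """Stage 1 (assumed rule): recursively bisect until parts are small.

    Each part is cut at the mean of its highest-variance coordinate;
    the centroid of every final part is returned as a probable center.
    """
    if len(points) <= max_size:
        return [centroid(points)]
    cols = list(zip(*points))
    dim = max(range(len(cols)), key=lambda d: pvariance(cols[d]))
    cut = mean(cols[dim])
    left = [p for p in points if p[dim] <= cut]
    right = [p for p in points if p[dim] > cut]
    if not left or not right:  # no usable cut along this coordinate
        return [centroid(points)]
    return probable_centers(left, max_size) + probable_centers(right, max_size)


def group_centers(centers, k):
    """Stage 2 (assumed rule): single-linkage merging down to k groups.

    Equivalent to cutting the k-1 longest edges of the minimum spanning
    tree over the probable centers. Returns one centroid per group.
    """
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All pairwise edges, shortest first; ties are broken by index.
    edges = sorted(
        (dist(centers[i], centers[j]), i, j)
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    components = len(centers)
    for _, i, j in edges:
        if components <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            components -= 1

    groups = {}
    for i, c in enumerate(centers):
        groups.setdefault(find(i), []).append(c)
    return [centroid(g) for g in groups.values()]


def initial_centers(points, k, max_size):
    """Deterministic initial centers for K-means, sorted for stable output."""
    return sorted(group_centers(probable_centers(points, max_size), k))


if __name__ == "__main__":
    # Three small, well-separated groups of 2-D points.
    data = [
        (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
        (20.0, 0.0), (20.0, 1.0), (21.0, 0.0), (21.0, 1.0),
    ]
    print(initial_centers(data, k=3, max_size=2))
```

Because no step draws random numbers, repeated runs on the same data return the same centers, which is the property the abstract emphasizes. If stage 1 yields fewer than `k` probable centers, this sketch returns fewer than `k` initial centers, so `max_size` would need to be chosen small enough for the data at hand.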
Copyright information
© 2016 Springer India
About this paper
Cite this paper
Jothi, R., Mohanty, S.K., Ojha, A. (2016). On Careful Selection of Initial Centers for K-means Algorithm. In: Nagar, A., Mohapatra, D., Chaki, N. (eds) Proceedings of 3rd International Conference on Advanced Computing, Networking and Informatics. Smart Innovation, Systems and Technologies, vol 43. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2538-6_45
DOI: https://doi.org/10.1007/978-81-322-2538-6_45
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2537-9
Online ISBN: 978-81-322-2538-6
eBook Packages: Engineering, Engineering (R0)