Incremental k-Means Method

  • Rabinder Kumar Prasad
  • Rosy Sarmah
  • Subrata Chakraborty
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11941)

Abstract

In the last few decades, k-means has evolved into one of the most prominent data analysis methods used by researchers. However, a proper selection of the k initial centroids is essential for obtaining good-quality clusters, and such a selection is difficult to ascertain when the value of k is high. To overcome the initialization problem of the k-means method, we propose an incremental k-means clustering method that improves the quality of the clusters by reducing the Sum of Squared Error (\(SSE_{total}\)). We evaluate the proposed method through comprehensive experiments on synthetically generated datasets and some real-world datasets, comparing it against traditional k-means and its newer versions. Our experiments show that the proposed method gives considerably better results than its counterparts.
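As an illustration of the quality measure named above, the following is a minimal sketch of how a total sum of squared error can be computed for a given clustering. It assumes the usual definition (the sum of squared Euclidean distances between each point and its assigned centroid). The function name sse_total, its parameters, and the toy data are hypothetical and are not taken from the paper; the sketch does not implement the proposed incremental method.

```python
def sse_total(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid.

    Hypothetical helper for illustration; assumes the standard SSE definition.
    """
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


# Toy data (made up for this sketch): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# A partition that matches the two pairs.
good_labels = [0, 0, 1, 1]
good_centroids = [(0.0, 1.0), (10.0, 1.0)]

# A partition that splits each pair, as a poor initialization might produce.
poor_labels = [0, 1, 0, 1]
poor_centroids = [(5.0, 0.0), (5.0, 2.0)]

good_sse = sse_total(points, good_labels, good_centroids)
poor_sse = sse_total(points, poor_labels, poor_centroids)
print(good_sse, poor_sse)
```

On this toy data the second partition has a much larger error than the first, which is the kind of gap that a poor choice of initial centroids can leave behind when k-means converges to a local optimum.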

Keywords

k-means, Sum of squared error, Improving results

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of CSE, Dibrugarh University, Dibrugarh, India
  2. Department of CSE, Tezpur University, Tezpur, India
  3. Department of Statistics, Dibrugarh University, Dibrugarh, India
