Improving the Efficiency of the K-medoids Clustering Algorithm by Getting Initial Medoids

  • Joaquín Pérez-Ortega
  • Nelva N. Almanza-Ortega
  • Jessica Adams-López
  • Moisés González-Gárcia
  • Adriana Mexicano
  • Socorro Saenz-Sánchez
  • J. M. Rodríguez-Lelis
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 569)

Abstract

The conventional K-medoids algorithm is one of the most widely used clustering algorithms; however, one of its limitations is its sensitivity to the initial medoids. We propose generating optimized initial medoids, which increases both the efficiency and the effectiveness of K-medoids. The initial medoids are obtained in two steps. In the first step, the data are grouped with an efficient variant of the K-means algorithm called Early Classification. In the second step, the centroids generated by K-means are transformed into optimized initial medoids. The proposed approach was validated on a set of real data sets and compared with the solution of the conventional K-medoids algorithm. The results show that our approach reduced the execution time by an average of 68%. The quality of the results was compared using several well-known validation indexes, and the values were very similar.
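The two-step initialization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain Lloyd-style K-means stands in for the Early Classification variant, the centroid-to-medoid transformation is assumed to be "snap each centroid to its nearest data point" (the abstract does not specify it), and the function names `kmeans`, `centroids_to_medoids`, and `kmedoids` as well as the toy data are hypothetical.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd-style K-means (stand-in for the Early Classification variant)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members; keep it if the cluster is empty.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def centroids_to_medoids(points, centroids):
    """Assumed transformation: replace each centroid with its nearest data point.
    Note: two centroids could snap to the same point on poorly separated data."""
    return [min(points, key=lambda p: math.dist(p, c)) for c in centroids]

def kmedoids(points, medoids, max_iter=100):
    """Alternating K-medoids: assign points, then pick the cost-minimizing member."""
    medoids = list(medoids)
    for _ in range(max_iter):
        clusters = [[] for _ in medoids]
        for p in points:
            nearest = min(range(len(medoids)), key=lambda j: math.dist(p, medoids[j]))
            clusters[nearest].append(p)
        updated = [
            min(members, key=lambda cand: sum(math.dist(cand, q) for q in members))
            if members else medoids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids

# Toy data (hypothetical): two well-separated groups of five points each.
points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.4, 0.6),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0), (10.5, 10.4),
]

# Step 1: group the data with K-means, starting from two chosen points.
centroids = kmeans(points, [points[0], points[8]])
# Step 2: turn the centroids into initial medoids, then run K-medoids from them.
init_medoids = centroids_to_medoids(points, centroids)
final_medoids = kmedoids(points, init_medoids)
print(init_medoids, final_medoids)
```

On this toy data the medoids derived from the K-means centroids are already stable, so K-medoids stops after a single pass. That is the intuition behind the reported time savings, although the 68% figure comes from the authors' experiments and not from this sketch.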

Keywords

Clustering, K-medoids, K-means, Hybrid algorithm, Efficiency

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Joaquín Pérez-Ortega (1)
  • Nelva N. Almanza-Ortega (1)
  • Jessica Adams-López (1)
  • Moisés González-Gárcia (1)
  • Adriana Mexicano (1)
  • Socorro Saenz-Sánchez (1)
  • J. M. Rodríguez-Lelis (1)

  1. National Technological of Mexico/CENIDET, Cuernavaca, Mexico
