HFSMOOK-Means: An Improved K-Means Algorithm Using Hesitant Fuzzy Sets and Multi-objective Optimization

  • Research Article-Computer Engineering and Computer Science
  • Arabian Journal for Science and Engineering

Abstract

Clustering is an important method in data mining. The K-means algorithm, one of the most common clustering methods, is highly sensitive to the choice of initial cluster centers, so selecting appropriate initial centers improves the resulting clustering. The present study aims to find suitable initial cluster centers for K-means, chosen so that the resulting clusters have high separation and high density. To this end, finding the initial cluster centers is formulated as a multi-objective optimization problem that maximizes both the distance between the initial cluster centers and the neighbor density of those centers. Solving this problem with the MOPSO algorithm yields a set of candidate initial cluster centers. Hesitant fuzzy sets are then used to evaluate the clusters generated from each candidate by considering separation, cohesion, and the silhouette index. Next, the informational energy of hesitant fuzzy sets is used to rank the non-dominated particles in the Pareto optimal set, and the initial cluster centers for starting the K-means algorithm are selected from this ranking. The proposed HFSMOOK-means method was compared with several clustering algorithms on common and widely used criteria. The results indicate that HFSMOOK-means performs well relative to the other algorithms on the majority of the datasets.
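The selection pipeline described in the abstract can be illustrated with a small sketch. The abstract does not give the exact objective formulas, so everything here is an assumption made for illustration: the distance objective is taken as the minimum pairwise distance between centers, the neighbor density as the mean number of points within a fixed radius of each center, and the ranking as "higher informational energy is better". The informational energy is computed as the sum, over criteria, of the mean squared membership value of each hesitant fuzzy element. All function names, the toy data, and the membership values are hypothetical, and the MOPSO search itself is replaced by three hand-written candidates.

```python
import math
from itertools import combinations


def min_center_distance(centers):
    """Objective 1 (assumed form): smallest pairwise distance among centers."""
    return min(math.dist(a, b) for a, b in combinations(centers, 2))


def neighbor_density(centers, data, radius):
    """Objective 2 (assumed form): mean number of points within `radius` of a center."""
    counts = [sum(math.dist(c, p) <= radius for p in data) for c in centers]
    return sum(counts) / len(centers)


def dominates(u, v):
    """True if u is at least as good as v on every objective and better on one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(scores):
    """Indices of non-dominated score tuples (all objectives are maximized)."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]


def informational_energy(hfs):
    """Sum over criteria of the mean squared membership value of each element."""
    return sum(sum(h * h for h in element) / len(element) for element in hfs)


# Toy data: two tight groups in the plane.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

# Hand-written stand-ins for the candidate center sets a MOPSO run would produce.
candidates = [
    [(0.0, 0.0), (5.0, 5.0)],  # one center in each group
    [(0.0, 0.0), (0.1, 0.0)],  # both centers in the same group
    [(2.5, 2.5), (9.0, 9.0)],  # well separated, but far from the data
]

scores = [(min_center_distance(c), neighbor_density(c, data, radius=0.5))
          for c in candidates]
front = pareto_front(scores)

# Made-up hesitant fuzzy evaluations of each non-dominated candidate on
# (separation, cohesion, silhouette); each inner list holds the possible
# membership values for one criterion.
evaluations = {
    0: [[0.8, 0.9], [0.7], [0.9]],
    2: [[0.9], [0.2, 0.3], [0.1]],
}

# Assumed ranking rule: the candidate with the highest energy is chosen.
best = max(front, key=lambda i: informational_energy(evaluations[i]))
initial_centers = candidates[best]
```

In this toy setting the second candidate is dominated by the first (same density, smaller separation), so only the first and third candidates remain in the Pareto set, and the hesitant fuzzy ranking then picks one of them as the K-means starting point.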

Author information

Corresponding author

Correspondence to Hassan Rezaei.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Cite this article

Rezaei, K., Rezaei, H. HFSMOOK-Means: An Improved K-Means Algorithm Using Hesitant Fuzzy Sets and Multi-objective Optimization. Arab J Sci Eng 45, 6241–6257 (2020). https://doi.org/10.1007/s13369-020-04620-5

  • DOI: https://doi.org/10.1007/s13369-020-04620-5
