Incremental Algorithm Based on Split Technique

  • Chedi Ounali
  • Fahmi Ben Rejab
  • Kaouther Nouira Ferchichi
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 941)

Abstract

Most clustering algorithms become ineffective when provided with unsuitable parameters or applied to data-sets which are composed of clusters with diverse shapes, sizes, and densities.

In this paper we present a new version of the k-means method that adds one new cluster to the k existing clusters without retraining from scratch. The method is based on a splitting process: we look for the cluster with the highest score and split it. The score is based on three criteria: the SSE (sum of squared errors), the dispersion index, and the size of the cluster. Finally, the split itself is performed using standard k-means. Experimental results demonstrate the effectiveness of our approach on both simulated and real data-sets.
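The splitting step described above can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so several choices here are assumptions rather than the authors' definitions: the dispersion index is taken as the variance-to-mean ratio of point-to-centroid distances, the three criteria are scaled by their maximum over all clusters and summed with equal weights, and the 2-means split uses a deterministic farthest-point initialisation. All function names (`split_scores`, `two_means`, `add_cluster`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Centroid (coordinate-wise mean) of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = mean(points)
    return sum(sq_dist(p, c) for p in points)


def dispersion_index(points):
    # ASSUMPTION: variance-to-mean ratio of the point-to-centroid distances.
    c = mean(points)
    d = [sq_dist(p, c) ** 0.5 for p in points]
    m = sum(d) / len(d)
    if m == 0:
        return 0.0
    return sum((v - m) ** 2 for v in d) / len(d) / m


def split_scores(clusters):
    # ASSUMPTION: each criterion is divided by its maximum over all clusters,
    # then the three scaled values are summed with equal weights.
    columns = []
    for criterion in (sse, dispersion_index, len):
        values = [criterion(c) for c in clusters]
        top = max(values) or 1.0  # avoid dividing by zero
        columns.append([v / top for v in values])
    return [sum(row) for row in zip(*columns)]


def two_means(points, iters=20):
    """Standard k-means with k=2 on a single cluster."""
    # ASSUMPTION: deterministic start from two mutually distant points.
    c0 = max(points, key=lambda p: sq_dist(p, mean(points)))
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    halves = None
    for _ in range(iters):
        a = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        b = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not a or not b or (a, b) == halves:
            break  # degenerate split or converged
        halves = (a, b)
        c0, c1 = mean(a), mean(b)
    if halves is None:
        raise ValueError("cluster needs at least two distinct points")
    return halves


def add_cluster(clusters):
    """Go from k to k+1 clusters by splitting only the top-scoring cluster."""
    scores = split_scores(clusters)
    target = scores.index(max(scores))
    a, b = two_means(clusters[target])
    return clusters[:target] + [a, b] + clusters[target + 1:]


clusters = [
    [(0, 0), (0, 1), (1, 0), (1, 1)],          # compact cluster
    [(10, 10), (10, 11), (30, 30), (30, 31)],  # two groups merged together
]
new_clusters = add_cluster(clusters)
```

Only the selected cluster is re-clustered, so the other k-1 clusters keep their members unchanged; this is what avoids retraining from scratch in the sketch.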

Keywords

Clustering, Clusters, K-Means, Incremental K-Means, SSE, Split, Dispersion-index

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. ISGT, LR99ES04 BESTMOD, Université de Tunis, Le Bardo, Tunisia
