
A density-grid-based method for clustering k-dimensional data

Applied Intelligence

Abstract

In this paper, we propose a novel density-grid-based method for clustering k-dimensional data. KIDS, an acronym for K-dimensional Ink Drop Spread, detects densely connected regions of data in k-dimensional grids. It simultaneously exploits the advantages of fuzzy logic, density-based clustering, and grid-based clustering. In the proposed method, the k-dimensional data space is divided into cells, and the input data records are mapped to these cells. Each data point is then spread over the neighboring k-dimensional cells, much as an ink drop spreads in water, so that the cells adjacent to a data cell also represent the data. Finally, the contributions of all data cells are aggregated and compared with a threshold to compute the final clusters. The experimental results show that the method achieves superior quality and efficiency in both low and high dimensions. In addition, the method is robust to noise and capable of finding clusters of arbitrary shape.
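The pipeline described in the abstract (map records to cells, spread them to adjacent cells, threshold, then group the surviving cells) can be illustrated with a small sketch. This is not the authors' KIDS implementation: the function name `grid_spread_cluster`, the parameters `cell_size`, `radius`, and `threshold`, the linear decay over Chebyshev distance, and the flood-fill grouping are all assumptions chosen only to make the idea concrete. The grid is stored sparsely in a dictionary so that empty cells cost nothing.

```python
from collections import defaultdict, deque
from itertools import product


def grid_spread_cluster(points, cell_size=1.0, radius=1, threshold=1.5):
    """Toy density-grid clustering: map, spread, threshold, connect.

    Hypothetical sketch; every parameter and the decay rule are assumptions.
    """
    # 1. Map every record to the integer index of its k-dimensional cell.
    cells = [tuple(int(x // cell_size) for x in p) for p in points]
    k = len(cells[0])

    # 2. Spread each point's "ink" over the cells within `radius` steps.
    #    The weight decays linearly with Chebyshev distance (assumed rule).
    offsets = list(product(range(-radius, radius + 1), repeat=k))
    ink = defaultdict(float)
    for cell in cells:
        for off in offsets:
            dist = max(abs(o) for o in off)
            target = tuple(c + o for c, o in zip(cell, off))
            ink[target] += 1.0 - dist / (radius + 1)

    # 3. Keep only the cells whose accumulated ink reaches the threshold.
    dense = {cell for cell, value in ink.items() if value >= threshold}

    # 4. Group adjacent dense cells with a breadth-first flood fill.
    neighbours = [off for off in product((-1, 0, 1), repeat=k) if any(off)]
    label_of = {}
    next_label = 0
    for start in sorted(dense):
        if start in label_of:
            continue
        label_of[start] = next_label
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for off in neighbours:
                nxt = tuple(c + o for c, o in zip(cur, off))
                if nxt in dense and nxt not in label_of:
                    label_of[nxt] = next_label
                    queue.append(nxt)
        next_label += 1

    # 5. A point inherits the label of its cell; -1 marks noise.
    return [label_of.get(cell, -1) for cell in cells]
```

With two well-separated groups of points and one isolated point, the sketch assigns one label per group and marks the isolated point as noise, because its cell never accumulates enough ink to pass the threshold. Enumerating all offsets costs (2·radius + 1)^k per point, so this naive version is only practical for small k; the paper itself should be consulted for how KIDS handles high dimensions.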

Notes

  1. https://github.com/alamkashani/KIDS

  2. That is, the probability that the data are randomly distributed even though we mistakenly treat them as groups that form clusters.

Acknowledgements

The authors would like to thank Nasim Bagheri for her generous contribution to the English editing and proofreading of the paper. This work was supported by the INSF (Iran National Science Foundation) under grant number 98011279.

Author information

Corresponding author

Correspondence to Elham S. Kashani.

About this article

Cite this article

Kashani, E.S., Bagheri Shouraki, S., Norouzi, Y. et al. A density-grid-based method for clustering k-dimensional data. Appl Intell 53, 10559–10573 (2023). https://doi.org/10.1007/s10489-022-03711-0

  • DOI: https://doi.org/10.1007/s10489-022-03711-0
