A density-grid-based method for clustering k-dimensional data

Kashani, Elham S.; Bagheri Shouraki, Saeed; Norouzi, Yaser; De Baets, Bernard

doi:10.1007/s10489-022-03711-0

Published: 20 August 2022

Applied Intelligence, Volume 53, pages 10559–10573 (2023)

Abstract

In this paper, we propose a novel density-grid-based method for clustering k-dimensional data. KIDS, an acronym for K-dimensional Ink Drop Spread, detects densely connected pieces of data in k-dimensional grids. It simultaneously exploits the advantages of fuzzy logic and of both density-based and grid-based clustering. In the proposed method, the k-dimensional data space is divided into cells, and the input data records are mapped to these cells. Each data point is then spread over the neighbouring k-dimensional cells, much as an ink drop spreads in water, so that the cells adjacent to a data cell also represent the data. Finally, the contributions of all data cells are accumulated and compared with a threshold to compute the final clusters. The experimental results show that the method achieves superior clustering quality and efficiency in both low and high dimensions. In addition, the method is robust to noise and capable of finding clusters of arbitrary shape.
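The steps outlined in the abstract (map records to cells, spread each record over adjacent cells, accumulate, threshold, group connected dense cells) can be illustrated with a minimal sketch. This is not the authors' KIDS implementation: the function name `ink_drop_grid_clustering`, the parameters `bins`, `radius` and `threshold`, the cone-shaped spread kernel, and the axis-aligned connectivity rule are all assumptions chosen for illustration.

```python
from collections import deque
from itertools import product

import numpy as np


def ink_drop_grid_clustering(points, bins=20, radius=2, threshold=1.0):
    """Illustrative grid + spread + threshold clustering (hypothetical sketch).

    Returns one label per point; -1 marks points whose cell is not dense.
    """
    points = np.asarray(points, dtype=float)
    k = points.shape[1]

    # 1. Divide the data space into bins**k cells and map each record to a cell.
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on constant dimensions
    cells = np.minimum(((points - lo) / span * bins).astype(int), bins - 1)

    # 2. Spread every record over nearby cells with a cone-shaped kernel
    #    (assumed shape). The grid is padded so the spread never leaves it.
    grid = np.zeros((bins + 2 * radius,) * k)
    offsets = list(product(range(-radius, radius + 1), repeat=k))
    weights = [max(0.0, 1.0 - np.linalg.norm(o) / (radius + 1)) for o in offsets]
    for c in cells:
        for o, w in zip(offsets, weights):
            if w > 0:
                grid[tuple(c + radius + np.array(o))] += w

    # 3. Compare the accumulated ink with the threshold.
    dense = grid > threshold

    # 4. Group dense cells into connected components (axis-aligned neighbours).
    cell_labels = np.zeros(grid.shape, dtype=int)
    next_label = 0
    for start in zip(*np.nonzero(dense)):
        if cell_labels[start]:
            continue
        next_label += 1
        cell_labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for axis in range(k):
                for step in (-1, 1):
                    nb = list(cell)
                    nb[axis] += step
                    nb = tuple(nb)
                    inside = 0 <= nb[axis] < grid.shape[axis]
                    if inside and dense[nb] and not cell_labels[nb]:
                        cell_labels[nb] = next_label
                        queue.append(nb)

    # 5. Each point inherits the label of its own cell (0 becomes -1 = noise).
    return np.array([cell_labels[tuple(c + radius)] for c in cells]) - 1
```

Because the sketch stores a dense array of `(bins + 2 * radius) ** k` cells, its memory use grows exponentially with the dimension `k`, so it is only practical for small `k`. How the published method handles high dimensions is not described in the abstract.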

Notes

https://github.com/alamkashani/KIDS
The probability of data being distributed randomly while we mistakenly consider the data as groups that form clusters.

Acknowledgements

The authors would like to thank Nasim Bagheri for her generous contribution to the English editing and proofreading of the paper. This work was supported by the Iran National Science Foundation (INSF) under grant number 98011279.

Author information

Authors and Affiliations

Electrical Engineering Department, Sharif University of Technology, Tehran, Iran
Elham S. Kashani & Saeed Bagheri Shouraki
Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran
Yaser Norouzi
KERMIT, Department of Data Analysis and Mathematical Modeling, Ghent University, Ghent, Belgium
Bernard De Baets

Corresponding author

Correspondence to Elham S. Kashani.

About this article

Cite this article

Kashani, E.S., Bagheri Shouraki, S., Norouzi, Y. et al. A density-grid-based method for clustering k-dimensional data. Appl Intell 53, 10559–10573 (2023). https://doi.org/10.1007/s10489-022-03711-0

Accepted: 29 April 2022
Published: 20 August 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s10489-022-03711-0
