A new hybridization of DBSCAN and fuzzy earthworm optimization algorithm for data cube clustering

  • Methodologies and Application
  • Published:
Soft Computing

Abstract

Aggregating data from different databases into a data warehouse creates multidimensional data such as data cubes. Because of the 3D structure of the data, clustering a data cube poses significant challenges. In this paper, new preprocessing techniques and a novel hybridization of DBSCAN and a fuzzy earthworm optimization algorithm (EWOA) are proposed to address these challenges. The proposed preprocessing consists of assigning an address to each cube cell and applying a dimension move to create related 2D data from the data cube, together with a new similarity metric. The DBSCAN algorithm, a density-based clustering algorithm, is applied to the related 2D data with both the Euclidean metric and the newly proposed similarity metric; the two variants are called DBSCAN1 and DBSCAN2, respectively. A new hybridization of EWOA and DBSCAN, called EWOA–DBSCAN, is proposed to improve DBSCAN. In addition, to tune the parameters of EWOA dynamically, a fuzzy logic controller is designed with two separate groups of fuzzy rules, Mamdani (EWOA–DBSCAN-Mamdani) and Sugeno (EWOA–DBSCAN-Sugeno). These ideas aim to provide efficient and flexible unsupervised analysis of a data cube by using a meta-heuristic algorithm to optimize DBSCAN’s parameters and by improving efficiency through dynamic tuning of the algorithm's own parameters. To evaluate efficiency, the proposed algorithms are compared with DBSCAN1, GA-DBSCAN1, GA-DBSCAN1-Mamdani and GA-DBSCAN1-Sugeno. The experimental results, consisting of 20 runs, indicate that the proposed ideas achieved their targets.
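The general pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the dimension move, the new similarity metric, the EWOA operators or the fuzzy rules, so none of them appear here. The sketch assumes that each cube cell is addressed by its (i, j, k) indices, uses the Euclidean metric only, and swaps in plain random search for the meta-heuristic. The names cube_to_rows, dbscan, score and random_search, the parameter ranges and the objective are all hypothetical.

```python
import math
import random


def cube_to_rows(cube):
    """Flatten a 3D cube (nested lists) into 2D rows [i, j, k, value].

    Assumption: the (i, j, k) index triple stands in for the cell address.
    """
    return [
        [i, j, k, value]
        for i, plane in enumerate(cube)
        for j, line in enumerate(plane)
        for k, value in enumerate(line)
    ]


def dbscan(points, eps, min_pts, metric=math.dist):
    """Textbook DBSCAN; returns one label per point, -1 meaning noise.

    A different similarity metric can be passed in through `metric`.
    """
    labels = [None] * len(points)
    cluster = -1

    def neighbours(idx):
        return [j for j, q in enumerate(points) if metric(points[idx], q) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


def score(labels):
    """Hypothetical objective: share of clustered points, 0 below 2 clusters."""
    if len({label for label in labels if label != -1}) < 2:
        return 0.0
    return sum(label != -1 for label in labels) / len(labels)


def random_search(points, trials=30, seed=0):
    """Random search over (eps, min_pts), a stand-in for the meta-heuristic."""
    rng = random.Random(seed)
    best = (None, None, -1.0)
    for _ in range(trials):
        eps = rng.uniform(0.5, 5.0)   # assumed search range
        min_pts = rng.randint(2, 5)   # assumed search range
        value = score(dbscan(points, eps, min_pts))
        if value > best[2]:
            best = (eps, min_pts, value)
    return best


if __name__ == "__main__":
    # Toy 2 x 2 x 2 cube: one plane of small values, one plane of large values.
    toy_cube = [[[0, 0], [0, 0]], [[100, 100], [100, 100]]]
    rows = cube_to_rows(toy_cube)
    print(dbscan(rows, eps=1.5, min_pts=3))
    print(random_search(rows))
```

In this sketch the search loop only proposes candidate values of eps and min_pts and keeps the best-scoring pair. A population-based optimizer with fuzzy-controlled parameters, as the paper proposes, would replace random_search while leaving cube_to_rows and dbscan unchanged.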

Funding

The study is not funded by any agency.

Author information

Corresponding author

Correspondence to Majid Abdolrazzagh-Nezhad.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Ethical approval

The manuscript does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Tables 6, 7, 8, 9, 10, 11, 12 and 13.

Table 6 Details of the experimental results for 20 runs of the DBSCAN1
Table 7 Details of the experimental results for 20 runs of the GA-DBSCAN1
Table 8 Details of the experimental results for 20 runs of the GA-DBSCAN1-Mamdani
Table 9 Details of the experimental results for 20 runs of the GA-DBSCAN1-Sugeno
Table 10 Details of the experimental results for 20 runs of the DBSCAN2
Table 11 Details of the experimental results for 20 runs of the EWOA–DBSCAN2
Table 12 Details of the experimental results for 20 runs of the EWOA–DBSCAN2-Mamdani
Table 13 Details of the experimental results for 20 runs of the EWOA–DBSCAN2-Sugeno

About this article

Cite this article

Hosseini Rad, M., Abdolrazzagh-Nezhad, M. A new hybridization of DBSCAN and fuzzy earthworm optimization algorithm for data cube clustering. Soft Comput 24, 15529–15549 (2020). https://doi.org/10.1007/s00500-020-04881-0

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-020-04881-0

Keywords
