A new local density and relative distance based spectrum clustering

  • Mingzhe Liu
  • Mingfu He
  • Ruili Wang
  • Shaoda Li
Regular Paper

Abstract

A novel local density and relative distance-based spectrum clustering (LDRDSC) algorithm is proposed for clustering multidimensional data. The density spectra combine redefined local densities with relative distances. Spectral peaks are taken as cluster centers because they correspond to local density maxima, and different clusters correspond to different spectra. The algorithm is validated on several benchmark data sets against the clustering by fast search and find of density peaks (CFSFDP) algorithm. Unlike CFSFDP, which must assign each data point to a cluster according to the cluster centers, LDRDSC clusters the remaining points automatically once the density spectrum has been generated. LDRDSC is also compared with five other typical clustering algorithms (DBSCAN, FCM, AP, Mean Shift and k-means). Computational results show that LDRDSC obtains better clustering results than these algorithms, especially in identifying noise points and isolated points.
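The two ingredients named in the abstract, local density and relative distance, can be illustrated with a minimal sketch. It uses the baseline CFSFDP definitions (cutoff-count density, and distance to the nearest point of higher density), not the redefined densities or the spectrum construction of LDRDSC, which the abstract does not specify. The function name `density_peak_quantities` and the parameter `cutoff` are hypothetical names chosen for this illustration.

```python
import numpy as np


def density_peak_quantities(points, cutoff):
    """Return (rho, delta) per point, following the baseline CFSFDP idea.

    rho[i]   : number of other points closer to point i than `cutoff`.
    delta[i] : distance from point i to the nearest point of strictly
               higher density; for a point with no denser neighbor,
               the largest distance from it to any other point.
    This is an illustrative assumption, not the LDRDSC definition.
    """
    points = np.asarray(points, dtype=float)

    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density: neighbors inside the cutoff, excluding the point itself.
    rho = (dist < cutoff).sum(axis=1) - 1

    # Relative distance: nearest point that has a strictly higher density.
    n = len(points)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            delta[i] = dist[i].max()
    return rho, delta


# Two small, well-separated groups; the middle point of each group is densest.
pts = [[0, 0], [0, 1], [1, 0], [0.5, 0.5],
       [10, 10], [10, 11], [11, 10], [10.5, 10.5]]
rho, delta = density_peak_quantities(pts, cutoff=0.8)
print(rho, delta)
```

Points for which both `rho` and `delta` are large are the candidate cluster centers. In this toy data set those are the two middle points, at indices 3 and 7.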

Keywords

Clustering, Local density, Relative distance, DBSCAN

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China (41274109), the Innovative Team Project of Sichuan Province (2015TD0020) and the New Zealand Marsden Fund.

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. State Key Laboratory of Geohazard Prevention and Geoenvironment Protection, Chengdu University of Technology, Chengdu, China
  2. College of Sciences, Massey University, Auckland, New Zealand
  3. College of Earth Science, Chengdu University of Technology, Chengdu, China
