Merging DBSCAN and Density Peak for Robust Clustering

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series (ICANN 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11730)

Abstract

In data clustering, density-based algorithms are well known for their ability to detect clusters of arbitrary shape. DBSCAN is a widely used density-based clustering approach, and the more recently proposed density peak (DP) algorithm has shown significant potential in experiments. However, DBSCAN may misclassify low-density border points as noise and does not work well when density varies widely across clusters, while the DP algorithm depends heavily on the detected cluster centers. To circumvent these problems, we study the two algorithms and find that they have complementary properties, and we propose to combine them. Specifically, we use the DP algorithm to detect cluster centers and then determine the parameters for DBSCAN adaptively. After DBSCAN clustering, we use the DP algorithm again to include low-density border points in clusters. By combining the complementary properties of the two algorithms, we relieve the problems of DBSCAN while avoiding the drawbacks of the DP algorithm. Our algorithm is tested on synthetic and real datasets and is demonstrated to perform better than DBSCAN and the density peak algorithm, as well as some other clustering algorithms.
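
The density peak step mentioned above can be illustrated with a minimal sketch, assuming the standard formulation of Rodriguez and Laio: each point gets a local density (the number of neighbours within a cutoff) and a distance to its nearest higher-density point, centers are the points where the product of the two is largest, and every other point inherits the label of its nearest higher-density neighbour. The function names, the cutoff `d_c`, the number of centers `k`, and the toy data are assumptions for illustration. The abstract does not say how the DBSCAN parameters are derived from the centers, so that step is not shown.

```python
import math


def density_peak_scores(points, d_c):
    """Return (rho, delta, parent, order) for the standard DP formulation.

    rho[i]    : number of other points closer than the cutoff d_c
    delta[i]  : distance to the nearest point ranked higher in density
    parent[i] : index of that nearest higher-density point (-1 for the top point)
    order     : indices sorted by decreasing density (ties broken by index)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no higher-density neighbour;
            # by convention its delta is its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    return rho, delta, parent, order


def pick_centers(rho, delta, k):
    """Pick the k points with the largest rho * delta as cluster centers."""
    gamma = [r * d for r, d in zip(rho, delta)]
    return sorted(range(len(rho)), key=lambda i: -gamma[i])[:k]


def assign_by_parent(centers, parent, order):
    """Give each non-center point the label of its nearest higher-density point."""
    labels = [-1] * len(parent)
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Visiting points in decreasing density guarantees a parent is labelled first.
    for i in order:
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return labels


# Toy data (hypothetical): two well-separated groups, each a square plus its middle.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]

rho, delta, parent, order = density_peak_scores(points, d_c=1.0)
centers = pick_centers(rho, delta, k=2)
labels = assign_by_parent(centers, parent, order)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this toy case the square corners have low density, yet they still join a cluster because they follow their nearest higher-density neighbour. This is the assignment behaviour the abstract relies on when it uses the DP algorithm to pull low-density border points into clusters after DBSCAN.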

Acknowledgement

This work is supported in part by the National Natural Science Foundation of China under Grant No. 61473045, and by the Natural Science Foundation of Liaoning Province under Grant No. 20170540013.

Author information

Corresponding author

Correspondence to Jian Hou.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Hou, J., Lv, C., Zhang, A., E, X. (2019). Merging DBSCAN and Density Peak for Robust Clustering. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series. ICANN 2019. Lecture Notes in Computer Science, vol 11730. Springer, Cham. https://doi.org/10.1007/978-3-030-30490-4_48

  • DOI: https://doi.org/10.1007/978-3-030-30490-4_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30489-8

  • Online ISBN: 978-3-030-30490-4

  • eBook Packages: Computer Science (R0)
