
Density Based Clustering: Alternatives to DBSCAN

Chapter in: Partitional Clustering Algorithms

Abstract

Clustering has long been an important task in data analysis, and it remains so today. The de facto standard algorithm for density-based clustering is DBSCAN. Its main drawback is the need to tune its two parameters, ε and minPts. In this chapter we explore the possibilities and limits of two novel clustering algorithms, each of which requires only a single DBSCAN-like parameter yet performs well on benchmark data sets. Our first approach uses only a parameter similar to DBSCAN's minPts to incrementally find protoclusters, which are eventually merged while those that are too sparse are discarded. Our second approach uses only a local density, with no minimum number of points to be specified; it estimates clusters by viewing the data points from spectators watching them at different angles. Both algorithms produce results comparable to DBSCAN's: the first yields similar clusterings while also being able to assign multiple cluster labels to a point, and the second runs significantly faster than DBSCAN.
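To make the tuning burden concrete, the following is a minimal pure-Python sketch of classic DBSCAN (not the authors' code, and not the alternatives proposed in this chapter), showing where both parameters, eps and min_pts, enter the algorithm:

```python
# Minimal DBSCAN sketch illustrating its two parameters:
# eps (neighbourhood radius) and min_pts (minimum neighbourhood
# size for a core point). Illustrative only; function names are
# our own, not from the chapter.
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # not a core point (may later become a border point)
            continue
        labels[i] = cluster
        seeds = list(neighbours)
        while seeds:  # expand the cluster via density-reachability
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point adopted by this cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # j is itself a core point
                seeds.extend(j_neighbours)
        cluster += 1
    return labels

pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # dense blob A
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),   # dense blob B
       (10.0, 0.0)]                          # isolated point
print(dbscan(pts, eps=0.5, min_pts=3))       # → [0, 0, 0, 1, 1, 1, -1]
```

Shrinking eps or raising min_pts on the same data can dissolve both blobs into noise, which is exactly the sensitivity the two one-parameter alternatives in this chapter aim to avoid.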



Author information


Correspondence to Christian Braune.


Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Braune, C., Besecke, S., Kruse, R. (2015). Density Based Clustering: Alternatives to DBSCAN. In: Celebi, M. (eds) Partitional Clustering Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-09259-1_6


  • DOI: https://doi.org/10.1007/978-3-319-09259-1_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09258-4

  • Online ISBN: 978-3-319-09259-1

  • eBook Packages: Engineering (R0)
