Neighbourhood Contrast: A Better Means to Detect Clusters Than Density

  • Bo Chen
  • Kai Ming Ting
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10939)

Abstract

Most density-based clustering algorithms perform poorly when densities vary widely among clusters. This paper proposes a new measure called Neighbourhood Contrast (NC) as a better alternative to density for detecting clusters. NC assigns similar values to all local density maxima, regardless of their densities. Because of this property, NC is a better means of detecting clusters in a dataset with large density variations among clusters. We provide two applications of NC. First, replacing density with NC in the current state-of-the-art clustering procedure DP leads to significantly improved clustering performance. Second, we devise a new clustering algorithm called Neighbourhood Contrast Clustering (NCC), which requires neither density nor distance calculations and therefore has linear time complexity in the dataset size. Our empirical evaluation shows that both NC-based methods outperform density-based methods, including the current state-of-the-art.
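
The difficulty described in the abstract's opening sentence can be illustrated with a minimal sketch. It is not taken from the paper and does not implement NC, whose definition is not given in this abstract. The one-dimensional data, the helper `eps_density`, and the radius `eps=0.2` are all assumptions chosen for illustration. The sketch counts neighbours within a fixed radius, a common density estimate, and compares the fringe of a dense cluster with the peak of a sparse one.

```python
def eps_density(points, eps):
    """For each point, count the points within distance eps (itself included)."""
    return [sum(1 for q in points if abs(p - q) <= eps) for p in points]

# Hypothetical 1-D data: a dense cluster on [0, 1] and a sparse cluster on [5, 7].
dense = [i / 199 for i in range(200)]
sparse = [5 + 2 * i / 19 for i in range(20)]
points = dense + sparse

density = eps_density(points, eps=0.2)
dense_density = density[:len(dense)]
sparse_density = density[len(dense):]

# Compare the least dense point of the dense cluster
# with the densest point of the sparse cluster.
print("lowest density in dense cluster:  ", min(dense_density))
print("highest density in sparse cluster:", max(sparse_density))
```

In this toy setting, the densest point of the sparse cluster still has a lower density than every point of the dense cluster. A single global density threshold therefore cannot treat the two cluster peaks alike, which is the situation a measure that gives similar values to all local density maxima is meant to address.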

Keywords

Neighbourhood Contrast Clustering 

Notes

Acknowledgments

Bo Chen is supported by scholarships provided by Data61, CSIRO and Faculty of IT, Monash University.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Monash University, Clayton, Australia
  2. Federation University Australia, Churchill, Australia
