Improved Analysis of Complete-Linkage Clustering

  • Anna Großwendt
  • Heiko Röglin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9294)

Abstract

Complete-linkage clustering is a very popular method for computing hierarchical clusterings in practice, which is not fully understood theoretically. Given a finite set P ⊆ ℝ^d of points, the complete-linkage method starts with each point from P in a cluster of its own and then iteratively merges two clusters from the current clustering that have the smallest diameter when merged into a single cluster.
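
The merge rule described above can be sketched as follows. This is a minimal, unoptimized illustration that assumes Euclidean distance and stops once k clusters remain; the function names `diameter` and `complete_linkage` are hypothetical and not taken from the paper.

```python
from itertools import combinations
import math


def diameter(cluster):
    """Largest pairwise Euclidean distance in a cluster (0.0 for a singleton)."""
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)


def complete_linkage(points, k):
    """Greedily merge clusters until only k remain (assumes 1 <= k <= len(points))."""
    # Start with every point in a cluster of its own.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Choose the pair of clusters whose union has the smallest diameter.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: diameter(clusters[ij[0]] + clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(complete_linkage(pts, 2))
```

The sketch recomputes diameters from scratch in every round, so it is only suitable for small inputs; it is meant to mirror the definition, not to be efficient.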

We study the problem of partitioning P into k clusters such that the largest diameter of the clusters is minimized and we prove that the complete-linkage method computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant. This improves the best previously known bound of O(log k) due to Ackermann et al. (Algorithmica, 2014). Our improved bound also carries over to the k-center and the discrete k-center problem.
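
To make the objective concrete, the following sketch evaluates the largest cluster diameter of a given partition and finds the optimal value by exhaustive search on a tiny instance. It assumes the Chebyshev (maximum) norm as one example of a norm-induced metric; the helper names `chebyshev`, `max_diameter`, and `brute_force_optimum` are hypothetical, and the exhaustive search is exponential in the number of points, so it is only an illustration. The approximation factor of a clustering is its `max_diameter` divided by the optimum.

```python
from itertools import combinations, product


def chebyshev(p, q):
    """Distance induced by the maximum norm (an assumed example metric)."""
    return max(abs(a - b) for a, b in zip(p, q))


def max_diameter(clusters, dist):
    """Objective value: the largest diameter over all clusters."""
    return max(
        max((dist(p, q) for p, q in combinations(c, 2)), default=0.0)
        for c in clusters
    )


def brute_force_optimum(points, k, dist):
    """Smallest achievable largest diameter over all partitions into at most k clusters."""
    best = float("inf")
    for labels in product(range(k), repeat=len(points)):
        clusters = [
            [p for p, label in zip(points, labels) if label == j] for j in range(k)
        ]
        clusters = [c for c in clusters if c]  # drop unused labels
        best = min(best, max_diameter(clusters, dist))
    return best


if __name__ == "__main__":
    pts = [(0, 0), (1, 3), (10, 0), (11, 1)]
    candidate = [[(0, 0), (1, 3)], [(10, 0), (11, 1)]]
    cost = max_diameter(candidate, chebyshev)
    opt = brute_force_optimum(pts, 2, chebyshev)
    print(cost, opt, cost / opt)
```

Allowing fewer than k nonempty clusters does not change the optimum when there are at least k points, because splitting a cluster never increases its diameter.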

Keywords

Active Component, Span Tree, Approximation Factor, Optimal Cluster, Nonempty Intersection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Ackermann, M.R., Blömer, J., Kuntze, D., Sohler, C.: Analysis of agglomerative clustering. Algorithmica 69(1), 184–215 (2014)
  2. Cole, J.R., Wang, Q., Fish, J.A., Chai, B., McGarrell, D.M., Sun, Y., Brown, C.T., Porras-Alfaro, A., Kuske, C.R., Tiedje, J.M.: Ribosomal database project: data and tools for high throughput rRNA analysis. Nucleic Acids Research (2013)
  3. Dasgupta, S., Long, P.M.: Performance guarantees for hierarchical clustering. Journal of Computer and System Sciences 70(4), 555–569 (2005)
  4. Defays, D.: An efficient algorithm for a complete link method. The Computer Journal 20(4), 364–366 (1977)
  5. Feder, T., Greene, D.H.: Optimal algorithms for approximate clustering. In: Proc. of the 20th Annual ACM Symposium on Theory of Computing (STOC), pp. 434–444 (1988)
  6. Ghaemmaghami, H., Dean, D., Vogt, R., Sridharan, S.: Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach. In: Proc. of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4185–4188 (2012)
  7. Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. Journal of Computer Security 19(4), 639–668 (2011)

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Department of Computer Science, University of Bonn, Bonn, Germany
