Abstract
Complete-linkage clustering is a very popular method for computing hierarchical clusterings in practice, which is not fully understood theoretically. Given a finite set \(P\subseteq \mathbb {R}^d\) of points, the complete-linkage method starts with each point from P in a cluster of its own and then iteratively merges two clusters from the current clustering that have the smallest diameter when merged into a single cluster. We study the problem of partitioning P into k clusters such that the largest diameter of the clusters is minimized and we prove that the complete-linkage method computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant. This improves the best previously known bound of \(O(\log {k})\) due to Ackermann et al. (Algorithmica 69(1):184–215, 2014). Our improved bound also carries over to the k-center and the discrete k-center problem.
Similar content being viewed by others
References
Ackermann, M.R., Blömer, J., Kuntze, D., Sohler, C.: Analysis of agglomerative clustering. Algorithmica 69(1), 184–215 (2014)
Cole, J.R., Wang, Q., Fish, J.A., Chai, B., McGarrell, D.M., Sun, Y., Brown, C.T., Porras-Alfaro, A., Kuske, C.R., Tiedje, J.M.: Ribosomal database project: data and tools for high throughput rrna analysis. Nucl. Acids Res. 42, D633–D642 (2013)
Dasgupta, S., Long, P.M.: Performance guarantees for hierarchical clustering. J. Comput. Syst. Sci. 70(4), 555–569 (2005)
Defays, D.: An efficient algorithm for a complete link method. Comput. J. 20(4), 364–366 (1977)
Feder, T., Greene, D.H.: Optimal algorithms for approximate clustering. In: Proceedings of the 20th Annual ACM Symposium on Theory of Computing (STOC), pp. 434–444 (1988)
Ghaemmaghami, H., Dean, D., Vogt, R., Sridharan, S.: Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach. In: Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4185–4188 (2012)
Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19(4), 639–668 (2011)
Author information
Authors and Affiliations
Corresponding author
Additional information
This research was supported by ERC Starting Grant 306465 (BeyondWorstCase). A preliminary version of this work appeared in the proceedings of ESA 2015.
Rights and permissions
About this article
Cite this article
Großwendt, A., Röglin, H. Improved Analysis of Complete-Linkage Clustering. Algorithmica 78, 1131–1150 (2017). https://doi.org/10.1007/s00453-017-0284-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-017-0284-6