A New, Fast and Accurate Algorithm for Hierarchical Clustering on Euclidean Distances

  • Elio Masciari
  • Giuseppe Massimiliano Mazzeo
  • Carlo Zaniolo
Conference paper

DOI: 10.1007/978-3-642-37456-2_10

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7819)
Cite this paper as:
Masciari E., Mazzeo G.M., Zaniolo C. (2013) A New, Fast and Accurate Algorithm for Hierarchical Clustering on Euclidean Distances. In: Pei J., Tseng V.S., Cao L., Motoda H., Xu G. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2013. Lecture Notes in Computer Science, vol 7819. Springer, Berlin, Heidelberg

Abstract

A simple hierarchical clustering algorithm called CLUBS (for CLustering Using Binary Splitting) is proposed. CLUBS is faster and more accurate than existing algorithms, including k-means and its recently proposed refinements. The algorithm consists of a divisive phase and an agglomerative phase; during these two phases, the samples are repartitioned using a least quadratic distance criterion possessing unique analytical properties that we exploit to achieve a very fast computation. CLUBS derives good clusters without requiring input from users, and it is robust and impervious to noise, while providing better speed and accuracy than methods, such as BIRCH, that are endowed with the same critical properties.
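The abstract does not spell out how the least quadratic distance criterion is evaluated, so the following is only a minimal sketch of one way a sum-of-squares criterion can be computed cheaply during a binary split. It is not the authors' CLUBS implementation. The function names `sse` and `best_binary_split`, the restriction to axis-parallel splits, and the midpoint threshold are all assumptions made for illustration. The sketch relies on the standard identity that a group's sum of squared distances to its centroid equals the sum of squared norms minus the squared norm of the coordinate sums divided by the group size, which lets each candidate split be scored from running totals.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def sse(n: int, sums: List[float], sq_norms: float) -> float:
    """Sum of squared distances to the centroid, computed from running totals.

    n        -- number of points in the group
    sums     -- per-dimension coordinate sums of the group
    sq_norms -- sum of squared Euclidean norms of the group's points
    """
    if n == 0:
        return 0.0
    return sq_norms - sum(v * v for v in sums) / n


def best_binary_split(points: List[Point]) -> Optional[Tuple[int, float, float]]:
    """Return (axis, threshold, total_sse) for the axis-parallel split with lowest SSE.

    Hypothetical helper for illustration only; returns None if no split exists.
    """
    n = len(points)
    d = len(points[0])
    total_sums = [sum(p[k] for p in points) for k in range(d)]
    total_sq = sum(sum(v * v for v in p) for p in points)

    best = None
    for axis in range(d):
        order = sorted(points, key=lambda p: p[axis])
        left_sums = [0.0] * d
        left_sq = 0.0
        for i in range(n - 1):
            # Move point i from the right group to the left group.
            p = order[i]
            for k in range(d):
                left_sums[k] += p[k]
            left_sq += sum(v * v for v in p)

            # Equal coordinates cannot be separated by a threshold on this axis.
            if order[i][axis] == order[i + 1][axis]:
                continue

            right_sums = [total_sums[k] - left_sums[k] for k in range(d)]
            cost = sse(i + 1, left_sums, left_sq) + sse(
                n - i - 1, right_sums, total_sq - left_sq
            )
            if best is None or cost < best[2]:
                threshold = (order[i][axis] + order[i + 1][axis]) / 2
                best = (axis, threshold, cost)
    return best
```

Under these assumptions, each axis costs one sort plus a linear scan, because every candidate split is scored from the running totals rather than by revisiting the points.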


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Elio Masciari (1)
  • Giuseppe Massimiliano Mazzeo (1)
  • Carlo Zaniolo (2)
  1. ICAR-CNR, Italy
  2. UCLA, USA
