Data Mining and Knowledge Discovery, Volume 27, Issue 3, pp 344–371

A framework for semi-supervised and unsupervised optimal extraction of clusters from hierarchies

  • R. J. G. B. Campello
  • D. Moulavi
  • A. Zimek
  • J. Sander
Article

DOI: 10.1007/s10618-013-0311-4

Cite this article as:
Campello, R.J.G.B., Moulavi, D., Zimek, A. et al. Data Min Knowl Disc (2013) 27: 344. doi:10.1007/s10618-013-0311-4

Abstract

We introduce a framework for the optimal extraction of flat clusterings from local cuts through cluster hierarchies. The extraction of a flat clustering from a cluster tree is formulated as an optimization problem, and a linear-complexity algorithm is presented that provides the globally optimal solution to this problem in semi-supervised as well as unsupervised scenarios. A collection of experiments is presented involving clustering hierarchies of different natures, a variety of real data sets, and comparisons with specialized methods from the literature.
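The abstract does not spell out the objective or the algorithm, so the following is only a minimal sketch of how such an extraction could work under stated assumptions: each cluster in the tree carries a single numeric quality score, the objective is the sum of the scores of the selected clusters, and exactly one cluster is selected on every leaf-to-root path (so no selected cluster contains another). The `Node` class, the `score` field, and the function name `extract_flat_clustering` are hypothetical names introduced for illustration and are not taken from the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """A cluster in the hierarchy (hypothetical structure for illustration)."""
    name: str
    score: float  # assumed per-cluster quality value
    children: List["Node"] = field(default_factory=list)


def extract_flat_clustering(root: Node) -> Tuple[float, List[str]]:
    """Return the best total score and the names of the selected clusters.

    Assumption: the objective is additive over selected clusters.
    Each node is visited once bottom-up and at most once top-down.
    """
    keep_self: Dict[int, bool] = {}

    def solve(node: Node) -> float:
        # Best achievable total using only descendants of this node.
        child_total = sum(solve(child) for child in node.children)
        # A leaf must be kept; an inner node is kept when it scores at
        # least as well as the best selection inside its subtrees.
        keep = not node.children or node.score >= child_total
        keep_self[id(node)] = keep
        return node.score if keep else child_total

    def collect(node: Node, out: List[str]) -> None:
        # Descend only until the first kept cluster on each path.
        if keep_self[id(node)]:
            out.append(node.name)
        else:
            for child in node.children:
                collect(child, out)

    total = solve(root)
    selected: List[str] = []
    collect(root, selected)
    return total, selected


# Small made-up hierarchy; the scores are illustrative, not real data.
tree = Node("root", 1.0, [
    Node("A", 3.0, [Node("A1", 1.0), Node("A2", 1.5)]),
    Node("B", 1.0, [Node("B1", 2.0), Node("B2", 0.5)]),
])
total, selected = extract_flat_clustering(tree)
```

In this toy tree, cluster `A` is kept whole because its score exceeds the combined score of its children, while `B` is replaced by its two children, which illustrates how a local (non-horizontal) cut can arise. Ties are resolved in favour of the parent cluster here, which is an arbitrary choice of this sketch.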

Keywords

  • Hierarchical clustering
  • Optimal selection of clusters
  • Should-link and should-not-link constraints
  • Cluster quality

Copyright information

© The Author(s) 2013

Authors and Affiliations

  • R. J. G. B. Campello (2, 1)
  • D. Moulavi (1)
  • A. Zimek (1)
  • J. Sander (1)

  1. Department of Computing Sciences, University of Alberta, Edmonton, Canada
  2. Department of Computer Sciences, University of São Paulo, São Carlos, Brazil
