Machine Learning, Volume 98, Issue 1, pp 331–357

Probabilistic consensus clustering using evidence accumulation

Authors

  • André Lourenço
    • Instituto Superior de Engenharia de Lisboa
    • Instituto de Telecomunicações
  • Nicola Rebagliati
    • VTT Technical Research Center of Finland
  • Ana L. N. Fred
    • Instituto de Telecomunicações
    • Instituto Superior Técnico
  • Mário A. T. Figueiredo
    • Instituto de Telecomunicações
    • Instituto Superior Técnico
  • Marcello Pelillo
    • DAIS

DOI: 10.1007/s10994-013-5339-6

Cite this article as:
Lourenço, A., Rota Bulò, S., Rebagliati, N. et al. Mach Learn (2015) 98: 331. doi:10.1007/s10994-013-5339-6

Abstract

Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.
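As a rough illustration of the ideas in the abstract, the following sketch builds a co-association matrix from a small ensemble of base partitions and evaluates one particular Bregman divergence, the squared Euclidean distance, against co-occurrence probabilities derived from a candidate assignment. It is not the authors' algorithm. The function names `co_association` and `squared_loss`, the toy ensemble, and the choice to model the co-occurrence probability of two points as the inner product of their assignment vectors are all assumptions made for this example; the abstract does not spell out these details.

```python
import numpy as np

def co_association(ensemble):
    """Fraction of base partitions in which each pair of points shares a cluster."""
    labels = np.asarray(ensemble)                    # shape: (n_partitions, n_points)
    same = labels[:, :, None] == labels[:, None, :]  # pairwise label agreement per partition
    return same.mean(axis=0)                         # shape: (n_points, n_points)

def squared_loss(C, Y):
    """Squared Euclidean divergence between C and Y Y^T over ordered off-diagonal pairs.

    Y holds one probabilistic assignment vector per row (rows sum to 1).
    Modelling co-occurrence as Y Y^T is an assumption of this sketch.
    """
    P = Y @ Y.T
    off_diagonal = ~np.eye(len(C), dtype=bool)
    return float(((C - P)[off_diagonal] ** 2).sum())

# Toy ensemble: three base partitions of four points.
# The first two are the same partition with swapped labels, which the
# co-association matrix treats identically (no label correspondence needed).
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = co_association(ensemble)

# Candidate assignment: points 0 and 1 in one cluster, points 2 and 3 in the other.
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss = squared_loss(C, Y)
```

Here `Y` happens to be a crisp assignment, but any matrix whose rows lie on the probability simplex can be passed in, so an optimizer could search over soft assignments to reduce `loss`.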

Keywords

Consensus clustering, Evidence accumulation, Ensemble clustering, Bregman divergence

Copyright information

© The Author(s) 2013