A MAP Approach to Evidence Accumulation Clustering

  • André Lourenço
  • Samuel Rota Bulò
  • Nicola Rebagliati
  • Ana Fred
  • Mário Figueiredo
  • Marcello Pelillo
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 318)

Abstract

The Evidence Accumulation Clustering (EAC) paradigm is a clustering ensemble method that derives a consensus partition from a collection of base clusterings obtained using different algorithms. From the partitions in the ensemble it collects a set of pairwise observations about the co-occurrence of objects in the same cluster, and it uses these co-occurrence statistics to derive a similarity matrix, referred to as the co-association matrix. The Probabilistic Evidence Accumulation for Clustering Ensembles (PEACE) algorithm is a principled approach for extracting a consensus clustering from the observations encoded in the co-association matrix. It is based on a probabilistic model of the co-association matrix, parameterized by the unknown assignments of objects to clusters. In this paper we extend the PEACE algorithm by deriving a consensus solution according to a maximum a posteriori (MAP) approach, with Dirichlet priors defined for the unknown probabilistic cluster assignments. In particular, we study the positive regularization effect of Dirichlet priors on the final consensus solution, using both synthetic and real benchmark data.
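
As a rough illustration of the co-occurrence statistics described above, the following sketch builds a co-association matrix from a small, made-up ensemble of label vectors. The function name `co_association_matrix`, the toy `ensemble`, and the choice to normalize counts by the ensemble size are assumptions made for this example; they are not taken from the paper, and some formulations keep the raw co-occurrence counts instead.

```python
import numpy as np

def co_association_matrix(partitions):
    """Fraction of base clusterings in which each pair of objects shares a cluster.

    `partitions` is a sequence of label vectors, one per base clustering,
    all of the same length (hypothetical input format for this sketch).
    """
    partitions = np.asarray(partitions)  # shape: (n_clusterings, n_objects)
    n_clusterings, n_objects = partitions.shape
    counts = np.zeros((n_objects, n_objects))
    for labels in partitions:
        # Boolean matrix: True where objects i and j carry the same label.
        counts += labels[:, None] == labels[None, :]
    # Normalizing by the ensemble size is an assumption of this sketch.
    return counts / n_clusterings

# Toy ensemble: three base clusterings of four objects (made-up data).
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
C = co_association_matrix(ensemble)
```

Label values are only compared within a single base clustering, so the arbitrary naming of clusters in each partition does not affect the result.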

Keywords

  • Clustering algorithm
  • Clustering ensembles
  • Probabilistic modeling
  • Evidence accumulation clustering
  • Prior knowledge

Notes

Acknowledgments

This work was partially financed by an ERCIM “Alain Bensoussan” Fellowship Programme under the European Union Seventh Framework Programme (FP7/2007–2013), grant agreement n. 246016, by FCT under grants SFRH/PROTEC/49512/2009, PTDC/EEI-SII/2312/2012 (LearningS project) and PEst-OE/EEI/LA0008/2011, and by the Área Departamental de Engenharia Electronica e Telecomunicações e de Computadores of Instituto Superior de Engenharia de Lisboa, whose support the authors gratefully acknowledge.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • André Lourenço (1, 2)
  • Samuel Rota Bulò (4)
  • Nicola Rebagliati (6)
  • Ana Fred (3)
  • Mário Figueiredo (2)
  • Marcello Pelillo (5)

  1. Instituto Superior de Engenharia de Lisboa, Instituto de Telecomunicações, Lisbon, Portugal
  2. Instituto de Telecomunicações, Instituto Superior Técnico, Lisbon, Portugal
  3. Instituto de Telecomunicações, Scientific Area of Networks and Multimedia, Lisbon, Portugal
  4. Fondazione Bruno Kessler, Trento, Italy
  5. DAIS, Università Ca’ Foscari Venezia, Venice, Italy
  6. VTT, Espoo, Finland