Learning Similarities from Examples Under the Evidence Accumulation Clustering Paradigm

  • Ana L. N. Fred
  • André Lourenço
  • Helena Aidos
  • Samuel Rota Bulò
  • Nicola Rebagliati
  • Mário A. T. Figueiredo
  • Marcello Pelillo
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)

Abstract

The SIMBAD project puts forward a unified theory of data analysis under a (dis)similarity-based object representation framework. Our work builds on the duality between probabilistic and similarity notions in pairwise object comparison. We address the Evidence Accumulation Clustering paradigm as a means of learning pairwise similarities between objects, summarized in a co-association matrix. We show that the co-association matrix admits a dual similarity/probabilistic interpretation, and we exploit both interpretations to derive coherent consensus clustering methods: either by exploring embeddings over the learned pairwise similarities, in an attempt to better highlight the clustering structure of the data, or by means of a unified probabilistic approach leading to soft assignments of objects to clusters.
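
As an illustration of the co-association matrix mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the usual evidence accumulation definition, in which entry (i, j) is the fraction of partitions in the ensemble that place objects i and j in the same cluster. The function name `co_association` and the toy ensemble are hypothetical.

```python
import numpy as np

def co_association(partitions):
    """Return the matrix whose (i, j) entry is the fraction of partitions
    in which objects i and j share a cluster label (assumed definition)."""
    partitions = np.asarray(partitions)  # shape: (n_partitions, n_objects)
    n_partitions, n_objects = partitions.shape
    C = np.zeros((n_objects, n_objects))
    for labels in partitions:
        # Boolean matrix: True where objects i and j have the same label.
        C += (labels[:, None] == labels[None, :])
    return C / n_partitions

# Hypothetical toy ensemble: three partitions of five objects.
ensemble = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
C = co_association(ensemble)
print(np.round(C, 2))
```

Each entry lies in [0, 1], so it can be read either as a learned pairwise similarity or as an empirical estimate of the probability that two objects co-occur in a cluster, which is the dual reading the abstract refers to.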

Keywords

Locally Linear Embedding; Locality Preserve Projection; Dimensionality Reduction Method; Cluster Ensemble; Consensus Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • Ana L. N. Fred (1, corresponding author)
  • André Lourenço (2, 3)
  • Helena Aidos (1)
  • Samuel Rota Bulò (4)
  • Nicola Rebagliati (5)
  • Mário A. T. Figueiredo (1)
  • Marcello Pelillo (6)

  1. Instituto de Telecomunicações, Instituto Superior Técnico, Lisbon, Portugal
  2. Instituto Superior de Engenharia de Lisboa, Lisbon, Portugal
  3. Instituto de Telecomunicações, Lisbon, Portugal
  4. Fondazione Bruno Kessler, Povo, Trento, Italy
  5. VTT Technical Research Centre of Finland, Espoo, Finland
  6. DAIS, Università Ca’ Foscari, Venezia, Italy
