CIARP 2014: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, pp. 596-603
Learning Similarities by Accumulating Evidence in a Probabilistic Way
Abstract
Clustering ensembles take advantage of the diversity produced by multiple clustering algorithms to produce a consensus partition. Evidence accumulation clustering (EAC) combines the output of a clustering ensemble into a co-association similarity matrix, which records how often each pair of objects co-occurs in the same cluster. A consensus partition is then obtained by applying a clustering technique to this matrix. We propose a new combination matrix, in which the co-occurrences between objects are modeled probabilistically. We evaluate the proposed methodology using the dissimilarity increments distribution model. This distribution is based on a high-order dissimilarity measure, which uses triplets of nearest neighbors to identify sparse and odd-shaped clusters. Experimental results show that the proposed algorithm produces better and more robust results than EAC on both synthetic and real datasets.
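To make the baseline concrete, the following is a minimal sketch of classical EAC as the abstract describes it: count pairwise co-occurrences across the ensemble, normalize them into a co-association matrix, and cluster that matrix to obtain a consensus partition. It does not implement the probabilistic combination matrix proposed in the paper, whose definition is not given in the abstract. The function names, the toy ensemble, and the choice of average-linkage hierarchical clustering as the consensus step are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    partitions = np.asarray(partitions)  # shape: (n_partitions, n_objects)
    n_partitions, n_objects = partitions.shape
    coassoc = np.zeros((n_objects, n_objects))
    for labels in partitions:
        # Entry (i, j) is True when objects i and j carry the same label.
        coassoc += labels[:, None] == labels[None, :]
    return coassoc / n_partitions


def consensus_partition(coassoc, n_clusters):
    """Cut an average-linkage dendrogram built on 1 - co-association.

    Average linkage is an illustrative assumption; any clustering
    technique that accepts a similarity matrix could be used instead.
    """
    dissim = 1.0 - coassoc
    np.fill_diagonal(dissim, 0.0)
    condensed = squareform(dissim, checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical toy ensemble: three partitions of six objects.
# Label values are arbitrary within each partition.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
C = co_association(ensemble)
labels = consensus_partition(C, n_clusters=2)
```

Objects 0 and 1 share a cluster in every partition, so their co-association is 1, while objects 0 and 5 never do, so theirs is 0. Intermediate values such as the 2/3 between objects 2 and 3 are the pairwise evidence that the consensus step has to resolve.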
Keywords
Clustering ensembles, co-association matrix, voting scheme, probabilistic learning of similarities, dissimilarity increments distribution