International Workshop on Machine Learning, Optimization and Big Data

Machine Learning, Optimization, and Big Data pp 337-346

Semi-Naive Mixture Model for Consensus Clustering

  • Marco Moltisanti
  • Giovanni Maria Farinella
  • Sebastiano Battiato
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9432)

Abstract

Consensus clustering is a powerful method to combine multiple partitions obtained through different runs of clustering algorithms. The goal is to achieve a robust and stable partition of the space through a consensus procedure which exploits the diversity of multiple clusterings outputs. Several methods have been proposed to tackle the consensus clustering problem. Among them, the algorithm which models the problem as a mixture of multivariate multinomial distributions in the space of cluster labels gained high attention in the literature. However, to make the problem tractable, the theoretical formulation takes into account a Naive Bayesian conditional independence assumption over the components of the vector space in which the consensus function acts (i.e., the conditional probability of a \(d-\)dimensional vector space is represented as the product of conditional probability in an one dimensional feature space). In this paper we propose to relax the aforementioned assumption, heading to a Semi-Naive approach to model some of the dependencies among the components of the vector space for the generation of the final consensus partition. The Semi-Naive approach consists in grouping in a random way the components of the labels space and modeling the conditional density term in the maximum-likelihood estimation formulation as the product of the conditional densities of the finite set of groups composed by elements of the labels space. Experiments are performed to point out the results of the proposed approach.

References

  1. 1.
    Battiato, S., Farinella, G.M., Gallo, G., Ravì, D.: Exploiting textons distributions on spatial hierarchy for scene classification. J. Image Video Process. 2010(7), 1–13 (2010)Google Scholar
  2. 2.
    Battiato, S., Farinella, G.M., Guarnera, M., Messina, G., Ravì, D.: Red-eyes removal through cluster based linear discriminant analysis. In: 2010 17th IEEE International Conference on Image Processing (ICIP), pp. 2185–2188. IEEE (2010)Google Scholar
  3. 3.
    Battiato, S., Farinella, G.M., Puglisi, G., Ravì, D.: Aligning codebooks for near duplicate image detection. Multimedia Tools Appl. 72(2), 1483–1506 (2014)CrossRefGoogle Scholar
  4. 4.
    Estivill-Castro, V.: Why so many clustering algorithms: a position paper. ACM SIGKDD Explor. Newsl. 4(1), 65–75 (2002)MathSciNetCrossRefGoogle Scholar
  5. 5.
    Farinella, G.M., Moltisanti, M., Battiato, S.: Classifying food images represented as bag of textons. In: 2014 IEEE International Conference on Image Processing (ICIP), pp. 5212–5216 (2014)Google Scholar
  6. 6.
    Farinella, G.M., Moltisanti, M., Battiato, S.: Food recognition using consensus vocabularies. In: Murino, V., Puppo, E., Sona, D., Cristani, M., Sansone, C. (eds.) ICIAP 2015 Workshops. LNCS, vol. 9281, pp. 384–392. Springer, Heidelberg (2015) CrossRefGoogle Scholar
  7. 7.
    Fred, A., Jain, A.K.: Robust data clustering. In: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2–128. IEEE (2003)Google Scholar
  8. 8.
    Fred, A., Jain, A.K.: Data clustering using evidence accumulation. In: International Conference on Pattern Recognition, vol. 4, pp. 276–280. IEEE (2002)Google Scholar
  9. 9.
    Fred, A., Jain, A.K.: Evidence accumulation clustering based on the K-means algorithm. In: Caelli, T.M., Amin, A., Duin, R.P.W., Kamel, M.S., de Ridder, D. (eds.) SPR 2002 and SSPR 2002. LNCS, vol. 2396, pp. 442–451. Springer, Heidelberg (2002) CrossRefGoogle Scholar
  10. 10.
    Ghaemi, R., Sulaiman, N., Ibrahim, H., Mustapha, N.: A survey: clustering ensembles techniques. Eng. Technol. 38(February), 636–645 (2009)Google Scholar
  11. 11.
    Karypis, G., Aggarwal, R., Kumar, V., Shekhar, S.: Multilevel hypergraph partitioning: applications in VLSI domain. IEEE Trans. Very Large Scale Integr. VLSI Syst. 7(1), 69–79 (1999)CrossRefGoogle Scholar
  12. 12.
    Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)MathSciNetCrossRefGoogle Scholar
  13. 13.
    Kleinberg, J.: An impossibility theorem for clustering. In: Advances in Neural Information Processing Systems, pp. 446–453 (2002)Google Scholar
  14. 14.
    Kononenko, I.: Semi-naive bayesian classifier. In: Kodratoff, Y. (ed.) EWSL 1991. LNCS, vol. 482. Springer, Heidelberg (1991) Google Scholar
  15. 15.
    Özuysal, M., Calonder, M., Lepetit, V., Fua, P.: Fast keypoint recognition using random ferns. IEEE Trans. Pattern Anal. Mach. Intell. 32(3), 448–461 (2010)CrossRefGoogle Scholar
  16. 16.
    Pazzani, M.J.: Constructive induction of cartesian product attributes. In: Feature Extraction, Construction and Selection, pp. 341–354. Springer (1998)Google Scholar
  17. 17.
    Saffari, A., Bischof., H.: Clustering in a boosting framework. In: Proceedings of Computer Vision Winter Workshop (CVWW), St. Lambrecht, Austria, pp. 75–82 (2007)Google Scholar
  18. 18.
    Strehl, A., Ghosh, J.: Cluster ensembles–a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2003)MATHMathSciNetGoogle Scholar
  19. 19.
    Topchy, A., Jain, A.K., Punch, W.: Clustering ensembles: models of consensus and weak partitions. IEEE Trans. Pattern Anal. Mach. Intell. 27(12), 1866–1881 (2005)CrossRefGoogle Scholar
  20. 20.
    Zheng, F., Webb, G.: A comparative study of semi-naive bayes methods in classification learning. In: Proceedings of the 4th Australasian Data Mining Conference (AusDM 2005), pp. 141–156 (2005)Google Scholar
  21. 21.
    Zheng, Z., Webb, G.I., Ting, K.M.: Lazy bayesian rules: a lazy semi-naive bayesian learning technique competitive to boosting decision trees. In: Proceedings of the 16th International Conference on Machine Learning (1999)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Marco Moltisanti
    • 1
  • Giovanni Maria Farinella
    • 1
  • Sebastiano Battiato
    • 1
  1. 1.Image Processing Laboratory – Dipartimento di Matematica e InformaticaUniversità degli Studi di CataniaCataniaItaly

Personalised recommendations