A Fully Bayesian Framework for Positive Data Clustering

  • Mohamed Al Mashrgy
  • Nizar Bouguila
Part of the Studies in Computational Intelligence book series (SCI, volume 607)


Mixture modeling aims to describe data in which each observation belongs to one of several groups. Mixtures of distributions provide a flexible and convenient class of models for density estimation, and their statistical learning has been studied extensively. In this context, fully Bayesian approaches have been widely adopted for mixture estimation and model selection, and have shown some effectiveness because they incorporate prior knowledge about the parameters. In this chapter, we propose a fully Bayesian approach for learning finite generalized inverted Dirichlet (GID) mixture models using reversible jump Markov chain Monte Carlo (RJMCMC) [23]. RJMCMC handles model selection and parameter estimation simultaneously within a single algorithm. The merits of RJMCMC for GID mixture learning are investigated using synthetic data and a real application, namely object detection.
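To make the model concrete, the following is a minimal sketch of the quantity an RJMCMC sampler for a GID mixture would evaluate repeatedly: the mixture log-likelihood. The abstract does not state the density, so the form used here is an assumption taken from the commonly used GID parameterization, with exponents gamma_d = alpha_d + beta_d - beta_{d+1} and beta_{D+1} = 0. The function names `gid_log_pdf` and `mixture_log_likelihood` are hypothetical and chosen only for illustration.

```python
import math


def gid_log_pdf(x, alpha, beta):
    """Log-density of a generalized inverted Dirichlet at a positive vector x.

    Assumed form (not given in the abstract):
      p(x) = prod_d  Gamma(a_d + b_d) / (Gamma(a_d) Gamma(b_d))
                     * x_d^(a_d - 1) / (1 + sum_{l<=d} x_l)^(gamma_d),
      gamma_d = a_d + b_d - b_{d+1},  with b_{D+1} = 0.
    """
    dim = len(x)
    if not (len(alpha) == len(beta) == dim):
        raise ValueError("x, alpha and beta must have the same length")
    if any(v <= 0 for v in x):
        return -math.inf  # the density is supported on positive data only

    log_p = 0.0
    running_sum = 0.0  # x_1 + ... + x_d
    for d in range(dim):
        running_sum += x[d]
        beta_next = beta[d + 1] if d + 1 < dim else 0.0
        gamma_d = alpha[d] + beta[d] - beta_next
        # Log of the normalizing constant for dimension d.
        log_p += (math.lgamma(alpha[d] + beta[d])
                  - math.lgamma(alpha[d]) - math.lgamma(beta[d]))
        # Log of the kernel for dimension d.
        log_p += ((alpha[d] - 1.0) * math.log(x[d])
                  - gamma_d * math.log1p(running_sum))
    return log_p


def mixture_log_likelihood(data, weights, alphas, betas):
    """Log-likelihood of data under a finite GID mixture (log-sum-exp per point)."""
    total = 0.0
    for x in data:
        terms = [math.log(w) + gid_log_pdf(x, a, b)
                 for w, a, b in zip(weights, alphas, betas)]
        m = max(terms)
        if m == -math.inf:
            return -math.inf  # a point outside the support
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total
```

In one dimension this assumed density reduces to the inverted beta (beta prime) density, which gives a simple sanity check. A full RJMCMC sampler would add parameter updates and moves that change the number of components; none of that is sketched here.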


Keywords: Mixture Model; Markov Chain Monte Carlo; Gibbs Sampling; Proposal Distribution; Acceptance Probability



The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC).


  1. Bouguila, N.: Spatial color image databases summarization. In: The IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 953–956. Honolulu, HI, Apr 2007
  2. Bouguila, N., Daoudi, K.: Learning concepts from visual scenes using a binary probabilistic model. In: Proceedings of the IEEE International Workshop on Multimedia Signal Processing (MMSP), pp. 1–5 (2009)
  3. Bouguila, N., Wang, J.H., Hamza, A.B.: Software modules categorization through likelihood and Bayesian analysis of finite Dirichlet mixtures. J. Appl. Stat. 37(2), 235–252 (2010)
  4. Bouguila, N., Ziou, D.: Dirichlet-based probability model applied to human skin detection. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 521–524 (2004)
  5. Bouguila, N., Ziou, D.: A powerful finite mixture model based on the generalized Dirichlet distribution: Unsupervised learning and applications. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR), pp. 280–283 (2004)
  6. Bouguila, N., Ziou, D.: Improving content based image retrieval systems using finite multinomial Dirichlet mixture. In: The IEEE Workshop on Machine Learning for Signal Processing (MLSP), pp. 23–32. Sao Luis, Brazil, Oct 2004
  7. Bouguila, N., Elguebaly, T.: A fully Bayesian model based on reversible jump MCMC and finite beta mixtures for clustering. Expert Syst. Appl. 39(5), 5946–5959 (2012)
  8. Bouguila, N., ElGuebaly, W.: A statistical model for histogram refinement. In: Kurková, V., Neruda, R., Koutník, J. (eds.) Artificial Neural Networks - ICANN 2008, 18th International Conference, Prague, Czech Republic, September 3–6, 2008, Proceedings, Part I. Lecture Notes in Computer Science, vol. 5163, pp. 837–846. Springer (2008)
  9. Bouguila, N., ElGuebaly, W.: Discrete data clustering using finite mixture models. Pattern Recognit. 42(1), 33–42 (2009)
  10. Bouguila, N., Ziou, D.: Online clustering via finite mixtures of Dirichlet and minimum message length. Eng. Appl. Artif. Intell. 19(4), 371–379 (2006)
  11. Bouguila, N., Ziou, D., Hammoud, R.I.: On Bayesian analysis of a finite generalized Dirichlet mixture via a Metropolis-within-Gibbs sampling. Pattern Anal. Appl. 12(2), 151–166 (2009)
  12. Casella, G., George, E.I.: Explaining the Gibbs sampler. Am. Stat. 46(3), 167–174 (1992)
  13. Chib, S., Greenberg, E.: Understanding the Metropolis-Hastings algorithm. Am. Stat. 49(4), 327–335 (1995)
  14. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 886–893 (2005)
  15. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B 39(1), 1–38 (1977)
  16. Elguebaly, T., Bouguila, N.: Bayesian learning of finite generalized Gaussian mixture models on images. Sig. Proc. 91(4), 801–820 (2011)
  17. Elguebaly, T., Bouguila, N.: A Bayesian approach for the classification of mammographic masses. In: Sixth International Conference on Developments in eSystems Engineering (DeSE), pp. 99–104, Dec 2013
  18. Green, P.J.: Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711–732 (1995)
  19. Mashrgy, M.A., Bdiri, T., Bouguila, N.: Robust simultaneous positive data clustering and unsupervised feature selection using generalized inverted Dirichlet mixture models. Knowl.-Based Syst. 59, 182–195 (2014)
  20. Mashrgy, M.A., Bouguila, N., Daoudi, K.: A statistical framework for positive data clustering with feature selection: Application to object detection. In: 21st European Signal Processing Conference, EUSIPCO 2013, Marrakech, Morocco, September 9–13, 2013, pp. 1–5 (2013)
  21. McLachlan, G., Krishnan, T.: The EM Algorithm and Extensions. Wiley Series in Probability and Statistics, 2nd edn. Wiley, Hoboken (2008)
  22. McLachlan, G.J., Peel, D.: Finite Mixture Models. Wiley Series in Probability and Statistics. Wiley, New York (2000)
  23. Richardson, S., Green, P.J.: On Bayesian analysis of mixtures with an unknown number of components. J. Roy. Stat. Soc. B 59(4), 731–792 (1997)
  24. Robert, C.P.: The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation, 2nd edn. Springer, Berlin (2007)
  25. Rocha, A., Goldenstein, S.: PR: More than meets the eye. In: Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV), pp. 1–8 (2007)
  26. Schnatter, F.S.: Finite mixture and Markov switching models. Springer, New York (2006)
  27. Zhang, Z., Chan, K., Wu, Y., Chen, C.: Learning a multivariate Gaussian mixture model with the reversible jump MCMC algorithm. Stat. Comput. 14(4), 343–355 (2004)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
  2. Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
