Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures

Bourouis, Sami; Al-Osaimi, Faisal R.; Bouguila, Nizar; Sallay, Hassen; Aldosari, Fahd; Al Mashrgy, Mohamed

doi:10.1007/s00500-018-3244-4

Methodologies and Application
Published: 23 May 2018

Volume 23, pages 5799–5813 (2019)
Soft Computing

Sami Bourouis¹,
Faisal R. Al-Osaimi²,
Nizar Bouguila³ (ORCID: orcid.org/0000-0001-7224-7940),
Hassen Sallay⁴,
Fahd Aldosari⁴ &
Mohamed Al Mashrgy³

Abstract

The goal of constructing models from examples has been approached from different perspectives. Statistical methods have been widely used and have proved effective in generating accurate models. Finite Gaussian mixture models have been used to describe a wide variety of random phenomena and have played a prominent role in many attempts to develop expressive statistical models in machine learning. However, their effectiveness is limited to applications where the underlying modeling assumptions (e.g., that the per-component densities are Gaussian) are reasonably satisfied. Thus, much research effort has been devoted to developing better alternatives. In this paper, we focus on constructing statistical models from positive vectors (i.e., vectors whose elements are strictly greater than zero), for which the generalized inverted Dirichlet (GID) mixture has been shown to be a flexible and powerful parametric framework. In particular, we propose a Bayesian density estimation method based upon mixtures of GIDs. Bayesian learning is attractive in several respects: it takes uncertainty into account by introducing prior information about the parameters, it allows simultaneous parameter estimation and model selection, and it helps overcome learning problems related to over- or under-fitting. Specifically, we develop a reversible jump Markov chain Monte Carlo sampler for GID mixtures, which we apply to simultaneous clustering and feature selection in the context of some challenging real-world applications concerning scene classification, action recognition, and video forgery detection.
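
The abstract names the GID mixture but does not reproduce its density. As a minimal sketch, the code below assumes the form commonly used in the GID literature: for a positive vector x of dimension D with parameters alpha_d, beta_d > 0, the density is the product over d of Gamma(alpha_d + beta_d) / (Gamma(alpha_d) Gamma(beta_d)) * x_d^(alpha_d - 1) / (1 + x_1 + ... + x_d)^gamma_d, with gamma_d = alpha_d + beta_d - beta_(d+1) and beta_(D+1) = 0. The function names `gid_logpdf` and `gid_mixture_logpdf` are hypothetical and chosen for illustration; this shows only how a finite mixture likelihood could be evaluated, not the reversible jump sampler developed in the paper.

```python
import math


def gid_logpdf(x, alpha, beta):
    """Log-density of a generalized inverted Dirichlet at a positive vector x.

    Assumed form (not taken from this page): gamma_d = alpha_d + beta_d - beta_{d+1},
    with beta_{D+1} = 0.
    """
    D = len(x)
    if not (len(alpha) == D and len(beta) == D):
        raise ValueError("x, alpha and beta must have the same length")
    if any(v <= 0 for v in list(x) + list(alpha) + list(beta)):
        raise ValueError("all entries of x, alpha and beta must be strictly positive")

    logp = 0.0
    running = 1.0  # holds 1 + x_1 + ... + x_d after each update
    for d in range(D):
        running += x[d]
        beta_next = beta[d + 1] if d + 1 < D else 0.0
        gamma_d = alpha[d] + beta[d] - beta_next
        # Log of the normalizing constant for dimension d.
        log_norm = (math.lgamma(alpha[d] + beta[d])
                    - math.lgamma(alpha[d]) - math.lgamma(beta[d]))
        logp += log_norm + (alpha[d] - 1.0) * math.log(x[d]) - gamma_d * math.log(running)
    return logp


def gid_mixture_logpdf(x, weights, alphas, betas):
    """Log-density of a finite GID mixture, computed with a log-sum-exp for stability.

    weights are assumed strictly positive and summing to one.
    """
    terms = [math.log(w) + gid_logpdf(x, a, b)
             for w, a, b in zip(weights, alphas, betas)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


if __name__ == "__main__":
    # Illustrative values only; they do not come from the paper.
    x = [0.5, 2.0, 1.5]
    alphas = [[2.0, 3.0, 1.5], [1.0, 1.0, 1.0]]
    betas = [[4.0, 2.5, 3.0], [2.0, 2.0, 2.0]]
    print(gid_mixture_logpdf(x, [0.4, 0.6], alphas, betas))
```

In one dimension the assumed density reduces to the inverted Beta distribution, which gives a quick sanity check on the normalizing constant.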


Notes

The source code of the cuboid detector is available at: http://vision.ucsd.edu/~pdollar.
The data set is available at: http://vision.eecs.ucf.edu/datasetsActions.html.


Acknowledgements

The authors would like to thank Umm Al-Qura University, Kingdom of Saudi Arabia, for their funding support under Grant Number 15-COM-3-1-0007.

Author information

Authors and Affiliations

Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
Sami Bourouis
Department of Computer Engineering, College of Computer Systems, Umm Al-Qura University, Mecca, Kingdom of Saudi Arabia
Faisal R. Al-Osaimi
The Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada
Nizar Bouguila & Mohamed Al Mashrgy
College of Computer and Information Systems, Umm Al-Qura University, Mecca, Kingdom of Saudi Arabia
Hassen Sallay & Fahd Aldosari

Corresponding author

Correspondence to Nizar Bouguila.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bourouis, S., Al-Osaimi, F.R., Bouguila, N. et al. Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures. Soft Comput 23, 5799–5813 (2019). https://doi.org/10.1007/s00500-018-3244-4

Published: 23 May 2018
Issue Date: 01 July 2019
DOI: https://doi.org/10.1007/s00500-018-3244-4
