Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures

  • Methodologies and Application
Soft Computing

Abstract

The goal of constructing models from examples has been approached from different perspectives. Statistical methods have been widely used and have proved effective in generating accurate models. Finite Gaussian mixture models in particular have been used to describe a wide variety of random phenomena and have played a prominent role in many attempts to develop expressive statistical models in machine learning. However, their effectiveness is limited to applications where the underlying modeling assumptions (e.g., that the per-component densities are Gaussian) are reasonably satisfied. Thus, much research effort has been devoted to developing better alternatives. In this paper, we focus on constructing statistical models from positive vectors (i.e., vectors whose elements are strictly greater than zero), for which the generalized inverted Dirichlet (GID) mixture has been shown to be a flexible and powerful parametric framework. In particular, we propose a Bayesian density estimation method based upon mixtures of GIDs. Bayesian learning is attractive in several respects: it takes uncertainty into account by introducing prior information about the parameters, it allows simultaneous parameter estimation and model selection, and it helps to overcome learning problems related to over- or under-fitting. Specifically, we develop a reversible jump Markov chain Monte Carlo sampler for GID mixtures, which we apply to simultaneous clustering and feature selection in several challenging real-world applications: scene classification, action recognition, and video forgery detection.
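
To make the notion of a GID mixture over positive vectors concrete, the following is a minimal sketch of how such a density could be evaluated. It is not taken from the paper: it assumes the form of the GID density commonly given in the literature, with shape parameters alpha_d, beta_d > 0 and exponents gamma_d = alpha_d + beta_d - beta_{d+1} (taking beta_{D+1} = 0). The function names `gid_logpdf` and `gid_mixture_logpdf` are hypothetical, and the reversible jump sampler itself is not shown.

```python
import math


def gid_logpdf(x, alpha, beta):
    """Log-density of a generalized inverted Dirichlet at a positive vector x.

    Assumed form (not reproduced in the abstract):
        p(x) = prod_d  G(a_d + b_d) / (G(a_d) G(b_d))
                       * x_d^(a_d - 1) / (1 + sum_{l<=d} x_l)^gamma_d
    with gamma_d = a_d + b_d - b_{d+1} and b_{D+1} = 0.
    """
    dim = len(x)
    if not (len(alpha) == len(beta) == dim):
        raise ValueError("x, alpha and beta must have the same length")
    if any(v <= 0 for v in x):
        return -math.inf  # the density is supported on strictly positive vectors

    logp = 0.0
    running = 1.0  # holds 1 + x_1 + ... + x_d
    for d in range(dim):
        running += x[d]
        beta_next = beta[d + 1] if d + 1 < dim else 0.0
        gamma_d = alpha[d] + beta[d] - beta_next
        logp += (
            math.lgamma(alpha[d] + beta[d])
            - math.lgamma(alpha[d])
            - math.lgamma(beta[d])
            + (alpha[d] - 1.0) * math.log(x[d])
            - gamma_d * math.log(running)
        )
    return logp


def gid_mixture_logpdf(x, weights, alphas, betas):
    """Log-density of a finite GID mixture, combined with log-sum-exp."""
    terms = [
        math.log(w) + gid_logpdf(x, a, b)
        for w, a, b in zip(weights, alphas, betas)
    ]
    top = max(terms)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

For a one-dimensional vector this reduces to the inverted Beta density, which gives a simple sanity check on the assumed parameterization.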


Notes

  1. The source code of the cuboid detector is available at: http://vision.ucsd.edu/~pdollar.

  2. The data set is available at: http://vision.eecs.ucf.edu/datasetsActions.html.


Acknowledgements

The authors would like to thank Umm al-Qura University, Kingdom of Saudi Arabia, for their funding support under Grant Number 15-COM-3-1-0007.

Author information

Corresponding author

Correspondence to Nizar Bouguila.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bourouis, S., Al-Osaimi, F.R., Bouguila, N. et al. Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures. Soft Comput 23, 5799–5813 (2019). https://doi.org/10.1007/s00500-018-3244-4

  • DOI: https://doi.org/10.1007/s00500-018-3244-4
