Abstract
In this paper, we propose to recognize human activities through unsupervised learning of a finite multivariate generalized Gaussian mixture model. We address an important issue in finite mixture modeling, namely the estimation of the mixture parameters when the covariance matrix is full. We develop a novel learning algorithm that combines a fixed-point covariance matrix estimator with the Expectation-Maximization algorithm. Furthermore, we propose an appropriate minimum message length (MML) criterion to deal with the model selection problem. We evaluate the proposed method on synthetic datasets and on a challenging application, namely human activity recognition from images and videos. The obtained results clearly show the merits of the proposed framework, which, thanks to the full covariance matrix, is better able to model correlated data.
Appendix
We need to express the Fisher information matrix in terms of the differentials \(d{\Sigma}_{r,s}\), where \({\Sigma}_{r,s}\) is the (r, s)-th non-redundant (i.e. r ≤ s) element of Σ. To this end, we introduce, for all r and s, the matrix E(r, s), which is built from \(\bar {E}(r,s)\), where \(\bar {E}(r,s)\) denotes the d × d matrix whose (r, s)-th entry is 1 and whose other entries are all 0.
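A common way to complete this construction for a symmetric matrix, given here as an assumption and not as the paper's own definition, is to let E(r, s) collect both mirrored entries of Σ, so that the differential of Σ can be expanded over the non-redundant elements only:

\[
E(r,s) =
\begin{cases}
\bar{E}(r,r), & r = s,\\
\bar{E}(r,s) + \bar{E}(s,r), & r \neq s,
\end{cases}
\qquad
d{\Sigma} = \sum_{r \le s} E(r,s)\, d{\Sigma}_{r,s}.
\]

For d = 2, for instance, this would give \(d{\Sigma} = E(1,1)\,d{\Sigma}_{1,1} + E(1,2)\,d{\Sigma}_{1,2} + E(2,2)\,d{\Sigma}_{2,2}\), where E(1, 2) has ones in positions (1, 2) and (2, 1), so each off-diagonal differential is counted once for both of its positions in Σ.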
Cite this article
Najar, F., Bourouis, S., Bouguila, N. et al. Unsupervised learning of finite full covariance multivariate generalized Gaussian mixture models for human activity recognition. Multimed Tools Appl 78, 18669–18691 (2019). https://doi.org/10.1007/s11042-018-7116-9
DOI: https://doi.org/10.1007/s11042-018-7116-9