Abstract
Understanding human action and activity from video data is a growing field that has gained rapid importance due to applications in surveillance, security, entertainment, and personal logging. In this work, a new hybrid technique is proposed for describing human action and activity in video sequences. The unified framework yields a robust feature vector that combines global and local information, strengthening the discriminative representation of actions. Initially, entropy-based texture segmentation is used for human silhouette extraction, followed by the construction of average energy silhouette images (AEIs). AEIs are 2D binary projections of the human silhouette frames of a video sequence, which reduces the time complexity of feature vector generation. Spatial distribution gradients are computed at different resolution levels of AEI sub-images, capturing the overall shape variations of the human silhouette during the activity. Owing to the scale, rotation, and translation invariance of space-time interest points (STIPs), a vocabulary of difference-of-Gaussian (DoG)-based STIPs is created using vector quantization, which is unique for each activity class. Extensive experiments are conducted to validate the performance of the proposed approach on four standard benchmarks: Weizmann, KTH, Ballet Movements, and multi-view IXMAS. Promising results are obtained in comparison with similar state-of-the-art methods, demonstrating the robustness of the proposed hybrid feature vector against the different challenges posed by the datasets, such as illumination and view variations.
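Two of the pipeline stages described above, collapsing a stack of binary silhouette frames into an average energy image and quantizing local STIP descriptors against a learned vocabulary, can be sketched as follows. This is a minimal illustration assuming NumPy arrays; the function names `average_energy_image` and `bow_histogram`, the 0.5 binarization threshold, and the nearest-codeword assignment are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def average_energy_image(silhouettes, threshold=0.5):
    """Collapse a stack of binary silhouette frames (T, H, W) into a
    single 2D average energy image, then binarize it (threshold is an
    assumed parameter, not taken from the paper)."""
    energy = silhouettes.astype(np.float64).mean(axis=0)
    return (energy >= threshold).astype(np.uint8)

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors (N, D) against a codebook (K, D) of
    codewords (e.g., obtained by k-means) and return a normalized
    bag-of-words histogram, one bin per codeword."""
    # squared Euclidean distance from every descriptor to every codeword
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                 # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    return hist / max(hist.sum(), 1.0)        # guard against empty input
```

In this sketch the AEI summarizes the global silhouette shape over time, while the histogram encodes the local STIP statistics; concatenating the two would give a hybrid global-plus-local feature vector of the kind the abstract describes.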
Cite this article
Vishwakarma, D.K., Dhiman, C. A unified model for human activity recognition using spatial distribution of gradients and difference of Gaussian kernel. Vis Comput 35, 1595–1613 (2019). https://doi.org/10.1007/s00371-018-1560-4