Compressed domain human action recognition in H.264/AVC video streams

Tom, Manu; Babu, R. Venkatesh; Praveen, R Gnana

doi:10.1007/s11042-014-2083-2

Published: 21 June 2014

Volume 74, pages 9323–9338, (2015)

Multimedia Tools and Applications

Manu Tom¹,
R. Venkatesh Babu¹ &
R Gnana Praveen¹


Abstract

This paper presents a novel high-speed approach to human action recognition in the H.264/AVC compressed domain. The proposed algorithm extracts features from cues in the quantization parameters and motion vectors of the compressed video sequence and classifies them using Support Vector Machines (SVM). The goal is an algorithm that is much faster than its pixel-domain counterparts while achieving comparable accuracy, using only the sparse information available in compressed video. Partial decoding avoids the complexity of full decoding and minimizes computational load and memory usage, which can reduce hardware utilization and speed up recognition. The proposed approach can handle illumination changes and variations in scale and appearance, and is robust in both outdoor and indoor testing scenarios. We evaluated the method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at more than 2,000 fps, approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.
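
The abstract does not spell out the feature definition, so the following is only a minimal sketch of one way motion vectors from a compressed stream could be turned into a fixed-length feature vector suitable for an SVM. It assumes a magnitude-weighted histogram of motion-vector orientations; the function name `orientation_histogram`, the bin count, and the sample vectors are hypothetical and are not taken from the paper.

```python
import math


def orientation_histogram(motion_vectors, bins=8):
    """Magnitude-weighted histogram of motion-vector directions, L1-normalized.

    motion_vectors: iterable of (dx, dy) pairs, e.g. one per macroblock.
    This is an illustrative feature, not the descriptor used in the paper.
    """
    hist = [0.0] * bins
    for dx, dy in motion_vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # zero vectors carry no direction information
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi]
        # Clamp so an angle that rounds up to 2*pi still lands in the last bin.
        index = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical motion vectors for a single frame (not data from the paper).
frame_mvs = [(4, 0), (3, 0), (0, 2), (0, 0), (-1, 0)]
feature = orientation_histogram(frame_mvs)
```

A vector like `feature`, computed per frame or accumulated over a clip, is the kind of compact input an SVM classifier could be trained on; working from motion vectors alone is what lets such a pipeline skip full pixel decoding.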

Acknowledgments

This work was supported by the CARS (CARS-25) project from the Centre for Artificial Intelligence and Robotics, Defence Research and Development Organization (DRDO), Govt. of India.

Author information

Authors and Affiliations

Video Analytics Lab, SERC, Indian Institute of Science, Bangalore, India
Manu Tom, R. Venkatesh Babu & R Gnana Praveen

Corresponding author

Correspondence to R. Venkatesh Babu.

About this article

Cite this article

Tom, M., Babu, R.V. & Praveen, R.G. Compressed domain human action recognition in H.264/AVC video streams. Multimed Tools Appl 74, 9323–9338 (2015). https://doi.org/10.1007/s11042-014-2083-2

Received: 29 August 2013
Revised: 09 March 2014
Accepted: 06 May 2014
Published: 21 June 2014
Issue Date: November 2015
DOI: https://doi.org/10.1007/s11042-014-2083-2
