Compressed domain human action recognition in H.264/AVC video streams

Multimedia Tools and Applications

Abstract

This paper presents a novel high-speed approach to human action recognition in the H.264/AVC compressed domain. The proposed algorithm extracts features from the quantization parameters and motion vectors of the compressed video sequence and classifies them with Support Vector Machines (SVM). The goal is an algorithm that is much faster than its pixel-domain counterparts, with comparable accuracy, while using only the sparse information available in compressed video. Partial decoding avoids the complexity of full decoding and reduces computational load and memory usage, which can lower hardware utilization and speed up recognition. The proposed approach handles illumination changes and variations in scale and appearance, and is robust in both outdoor and indoor testing scenarios. We evaluated the method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at more than 2,000 fps, approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.
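As a rough illustration of how quantization parameters and motion vectors could be turned into a feature vector, the following minimal sketch builds a magnitude-weighted histogram of motion-vector directions and appends simple QP statistics. It is not the descriptor used in the paper: the function names, the 8-bin histogram, the choice of mean and standard deviation for QP, and the toy input values are all assumptions made for illustration. In a full pipeline, a vector of this kind would be passed to an SVM classifier.

```python
import math

def mv_orientation_histogram(motion_vectors, bins=8):
    """Magnitude-weighted histogram of motion-vector directions (hypothetical helper).

    motion_vectors: iterable of (dx, dy) pairs, one per block.
    Returns a list of `bins` values that sum to 1, or all zeros if nothing moves.
    """
    hist = [0.0] * bins
    for dx, dy in motion_vectors:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # static block: contributes no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def qp_statistics(qp_values):
    """Mean and standard deviation of per-block QP values (assumed summary)."""
    mean = sum(qp_values) / len(qp_values)
    var = sum((q - mean) ** 2 for q in qp_values) / len(qp_values)
    return [mean, math.sqrt(var)]

def frame_features(motion_vectors, qp_values, bins=8):
    """Concatenate the motion histogram and the QP statistics into one vector."""
    return mv_orientation_histogram(motion_vectors, bins) + qp_statistics(qp_values)

# Toy input: four blocks, one of them static (values are made up).
mvs = [(1, 0), (2, 0), (0, 1), (0, 0)]
qps = [26, 28, 30, 28]
features = frame_features(mvs, qps)
print(features)
```

Because the sketch reads only per-block motion vectors and QP values, it never touches decoded pixels, which is the property the abstract credits for the speed advantage.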


Acknowledgments

This work was supported by the CARS (CARS-25) project from the Centre for Artificial Intelligence and Robotics, Defence Research and Development Organization (DRDO), Govt. of India.

Author information

Corresponding author

Correspondence to R. Venkatesh Babu.

Cite this article

Tom, M., Babu, R.V. & Praveen, R.G. Compressed domain human action recognition in H.264/AVC video streams. Multimed Tools Appl 74, 9323–9338 (2015). https://doi.org/10.1007/s11042-014-2083-2
