
Multimedia Tools and Applications, Volume 75, Issue 2, pp 1043–1078

A survey on compressed domain video analysis techniques

  • R. Venkatesh Babu
  • Manu Tom
  • Paras Wadekar

Abstract

Image and video analysis requires rich features that can characterize various aspects of visual information. These features are typically extracted from the pixel values of images and videos, which requires a huge amount of computation and is therefore seldom practical for real-time analysis. In contrast, compressed domain analysis offers relevant information about the visual content in the form of transform coefficients, motion vectors, quantization steps, and coded block patterns, with minimal computational burden. The amount of work done in the compressed domain is much smaller than in the pixel domain. This paper surveys the video analysis efforts published during the last decade across the spectrum of video compression standards. The survey covers only the analysis part and excludes the processing aspect of the compressed domain. The analysis spans various computer vision applications, such as moving object segmentation, human action recognition, indexing, retrieval, face detection, video classification, and object tracking in compressed videos.

Keywords

Video object segmentation; Human action recognition; Indexing; Retrieval; Face detection; Video classification; Object tracking; Object localization; Moving object detection; H.264/AVC; HEVC; MPEG; Compressed domain; Quantization parameter; Motion vectors; Transform coefficients; Video analysis

Notes

Acknowledgments

This work was supported by the CARS (CARS-25) project from the Centre for Artificial Intelligence and Robotics, Defence Research and Development Organization (DRDO), Govt. of India. The authors thank the referees for their useful comments and suggestions to improve the presentation of this paper.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Video Analytics Laboratory, SERC, Indian Institute of Science, Bangalore, India
