A camera motion histogram descriptor for video shot classification

Multimedia Tools and Applications

Abstract

In this paper, a novel camera motion descriptor is proposed for video shot classification. In the proposed method, raw motion information is extracted from consecutive video frames by computing the motion vector of each macroblock, forming motion vector fields (MVFs). Next, a motion consistency analysis is applied to the MVFs to eliminate inconsistent motion vectors. Each MVF is then divided into nine (3 × 3) local regions, and singular value decomposition (SVD) is applied in the temporal direction to the motion vectors extracted from each local region, so that the consistent motion vectors of several MVFs are compactly represented at a time to characterize temporal camera motion. Accordingly, each local region of the whole video shot is represented by a sequence of compact vectors. Finally, this sequence is converted into a histogram that describes the camera motion of each local region. The combination of all the local histograms forms the camera motion descriptor of a video shot. The shot descriptors are fed to a classifier, and in this work a support vector machine (SVM) is used for classification. The experimental results show that the proposed descriptor discriminates well between different camera motion patterns in professionally captured video shots. We also show that the proposed approach outperforms two state-of-the-art video shot classification methods.
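The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the consistency criterion, the window length, the way the SVD output is reduced to a compact vector, or the histogram binning, so the median-based filter, the `window` size, the use of the first right singular vector, and the eight-direction-plus-static binning below are all assumptions chosen for illustration. All function names are hypothetical, and the SVM stage is omitted.

```python
import numpy as np

def region_slices(h, w, grid=3):
    """Yield (row_slice, col_slice) for each cell of a grid x grid partition."""
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    for i in range(grid):
        for j in range(grid):
            yield slice(rows[i], rows[i + 1]), slice(cols[j], cols[j + 1])

def consistent(vectors, tol=2.0):
    """Assumed consistency test: keep vectors within tol of the median vector."""
    med = np.median(vectors, axis=0)
    keep = np.linalg.norm(vectors - med, axis=1) <= tol
    return vectors[keep]

def compact_vector(vectors):
    """Summarise an (N, 2) stack of motion vectors by its dominant SVD direction."""
    if len(vectors) == 0:
        return np.zeros(2)
    _, s, vt = np.linalg.svd(vectors, full_matrices=False)
    direction = vt[0]
    # Singular vectors have arbitrary sign, so orient along the mean motion.
    if direction @ vectors.mean(axis=0) < 0:
        direction = -direction
    # Scaling by s[0] / sqrt(N) returns v itself when every row equals v.
    return direction * s[0] / np.sqrt(len(vectors))

def direction_bin(v, n_dirs=8, static_thresh=0.5):
    """Assumed binning: bin 0 is 'static', bins 1..n_dirs are centred directions."""
    if np.hypot(v[0], v[1]) < static_thresh:
        return 0
    angle = np.arctan2(v[1], v[0]) % (2 * np.pi)
    return 1 + int(np.floor(angle / (2 * np.pi) * n_dirs + 0.5)) % n_dirs

def camera_motion_descriptor(mvfs, window=5, grid=3, n_dirs=8):
    """mvfs has shape (T, H, W, 2): one motion vector per macroblock per field."""
    t, h, w, _ = mvfs.shape
    hists = []
    for rs, cs in region_slices(h, w, grid):
        hist = np.zeros(n_dirs + 1)
        # Each temporal window of MVFs yields one compact vector per region.
        for start in range(0, t, window):
            block = mvfs[start:start + window, rs, cs].reshape(-1, 2)
            v = compact_vector(consistent(block))
            hist[direction_bin(v, n_dirs)] += 1
        hists.append(hist / hist.sum())
    # The shot descriptor is the concatenation of the local histograms.
    return np.concatenate(hists)

# Synthetic pan: every macroblock moves 3 pixels along +x in all 10 fields.
pan = np.zeros((10, 6, 9, 2))
pan[..., 0] = 3.0
descriptor = camera_motion_descriptor(pan)
```

Under these assumptions a uniform pan puts all mass in the same direction bin of every region, whereas a zoom would spread mass over different direction bins in different regions, which is what makes the concatenated local histograms usable as classifier input.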

Acknowledgments

We would like to thank the reviewers for their valuable comments. This work is partly supported by a UTS International Research Scholarship and by the Specialized Research Fund for the Doctoral Program of Higher Education of China (20120041120050).

Author information

Corresponding author

Correspondence to Xiangjian He.

About this article

Cite this article

Hasan, M.A., Xu, M., He, X. et al. A camera motion histogram descriptor for video shot classification. Multimed Tools Appl 74, 11073–11098 (2015). https://doi.org/10.1007/s11042-014-2218-5

  • DOI: https://doi.org/10.1007/s11042-014-2218-5
