Abstract
This paper proposes a novel camera motion descriptor for video shot classification. In the proposed method, raw motion information is extracted from consecutive video frames by computing the motion vector of each macroblock, forming motion vector fields (MVFs). A motion consistency analysis is then applied to the MVFs to eliminate inconsistent motion vectors. Next, each MVF is divided into nine (3 × 3) local regions, and singular value decomposition (SVD) is applied, along the temporal direction, to the motion vectors extracted from each local region. In this way, the consistent motion vectors of several MVFs are compactly represented at a time to characterize temporal camera motion, so each local region of the whole video shot is represented by a sequence of compact vectors. Finally, each sequence of vectors is converted into a histogram that describes the camera motion of the corresponding local region. The combination of all local histograms serves as the camera motion descriptor of the video shot. The shot descriptors are fed to a classifier; in this work, a support vector machine (SVM) is used. Experimental results show that the proposed descriptor has strong discriminative capability for classifying different camera motion patterns in professionally captured video shots. The proposed approach also outperforms two state-of-the-art video shot classification methods.
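The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the window length, the histogram binning, or the exact form of the SVD-based compact vector, so the choices below (a non-overlapping window of 5 MVFs, 8 direction bins plus one "static" bin, the dominant right singular vector scaled by its singular value) are assumptions made purely for illustration. The function names are hypothetical, and the motion consistency analysis is omitted.

```python
import numpy as np


def region_slices(h, w, grid=3):
    """Yield (row_slice, col_slice) for each cell of a grid x grid partition."""
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    for i in range(grid):
        for j in range(grid):
            yield slice(rows[i], rows[i + 1]), slice(cols[j], cols[j + 1])


def compact_vector(vectors):
    """Summarize an (n, 2) stack of motion vectors by its dominant SVD direction.

    Assumption: the compact vector is the first right singular vector,
    scaled by the first singular value divided by sqrt(n).
    """
    _, s, vt = np.linalg.svd(vectors, full_matrices=False)
    direction = vt[0]
    # The sign of a singular vector is arbitrary; orient it along the mean motion.
    if np.dot(direction, vectors.mean(axis=0)) < 0:
        direction = -direction
    return direction * s[0] / np.sqrt(len(vectors))


def shot_descriptor(mvfs, window=5, n_bins=8, static_thresh=0.5):
    """Build a concatenated per-region direction histogram for one shot.

    mvfs: array of shape (T, H, W, 2) holding one (dx, dy) per macroblock.
    Returns a vector of length 9 * (n_bins + 1).
    """
    t, h, w, _ = mvfs.shape
    bin_width = 2 * np.pi / n_bins
    histograms = []
    for rs, cs in region_slices(h, w):
        hist = np.zeros(n_bins + 1)  # the last bin counts near-static windows
        for start in range(0, t - window + 1, window):
            block = mvfs[start:start + window, rs, cs].reshape(-1, 2)
            v = compact_vector(block)
            if np.linalg.norm(v) < static_thresh:
                hist[n_bins] += 1
            else:
                # Bins are centred on the directions 0, 45, 90, ... degrees.
                angle = (np.arctan2(v[1], v[0]) + bin_width / 2) % (2 * np.pi)
                hist[int(angle / bin_width) % n_bins] += 1
        histograms.append(hist / max(hist.sum(), 1))
    return np.concatenate(histograms)


# Synthetic example: every macroblock moves 2 units along +x (a uniform pan).
pan = np.zeros((10, 6, 6, 2))
pan[..., 0] = 2.0
pan_descriptor = shot_descriptor(pan)

# Synthetic example: no motion at all.
static_descriptor = shot_descriptor(np.zeros((10, 6, 6, 2)))
```

In this sketch each shot yields a fixed-length vector (81 values with the defaults), which could then be passed to an SVM implementation such as scikit-learn's `SVC` for classification; that step is left out here.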
Acknowledgments
We would like to thank the reviewers for their valuable comments. This work is partly supported by a UTS International Research Scholarship and the Specialized Research Fund for the Doctoral Program of Higher Education of China (20120041120050).
About this article
Cite this article
Hasan, M.A., Xu, M., He, X. et al. A camera motion histogram descriptor for video shot classification. Multimed Tools Appl 74, 11073–11098 (2015). https://doi.org/10.1007/s11042-014-2218-5
DOI: https://doi.org/10.1007/s11042-014-2218-5