Abstract
A compact framework is presented for the description and representation of videos depicting human activities, with the goal of enabling automated large-volume video summarization for semantically meaningful key-frame extraction. The framework is structured around the concept of per-frame visual word histograms, using the popular Bag-of-Features approach. Three existing image descriptors (histogram, FMoD, SURF) and a novel one (LMoD), as well as a component of an existing state-of-the-art activity descriptor (Dense Trajectories), are adapted into the proposed framework and quantitatively compared against each other, as well as against the most common video summarization descriptor (global image histogram), using a publicly available annotated dataset and the most prevalent video summarization method, i.e., frame clustering. In all cases, several image modalities are exploited (luminance, hue, edges, optical flow magnitude) in order to simultaneously capture information about the depicted shapes, colors, lighting, textures and motions. The quantitative evaluation results indicate that one of the proposed descriptors clearly outperforms the competing approaches in the context of the presented framework.
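The pipeline outlined in the abstract (local descriptors quantized into per-frame visual word histograms, followed by frame clustering to select key-frames) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `frame_histograms`, `extract_key_frames`), the parameter values, and the use of plain Lloyd's k-means with random seeding are all assumptions made for illustration, and the descriptors are taken as already-extracted NumPy arrays, one array of shape `(n_descriptors, dim)` per frame.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means with random seeding (an assumption of this sketch).

    Returns (centroids, labels).
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment against the last centroid update.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def frame_histograms(frame_descriptors, codebook):
    """Build one L1-normalized visual word histogram per frame."""
    hists = []
    for desc in frame_descriptors:
        dists = np.linalg.norm(desc[:, None, :] - codebook[None, :, :], axis=2)
        words = dists.argmin(axis=1)  # nearest visual word per descriptor
        hist = np.bincount(words, minlength=len(codebook)).astype(float)
        hists.append(hist / max(hist.sum(), 1.0))  # guard against empty frames
    return np.vstack(hists)


def extract_key_frames(frame_descriptors, n_words=8, n_key_frames=3, seed=0):
    """Cluster per-frame histograms and return one frame index per cluster."""
    # 1. Learn the codebook (visual vocabulary) from all descriptors.
    codebook, _ = kmeans(np.vstack(frame_descriptors), n_words, seed=seed)
    # 2. Represent every frame as a histogram over visual words.
    hists = frame_histograms(frame_descriptors, codebook)
    # 3. Cluster the frames; the frame closest to each centroid is a key-frame.
    centers, labels = kmeans(hists, n_key_frames, seed=seed)
    key_frames = []
    for j in range(n_key_frames):
        idx = np.flatnonzero(labels == j)
        if len(idx) == 0:
            continue
        nearest = idx[np.linalg.norm(hists[idx] - centers[j], axis=1).argmin()]
        key_frames.append(int(nearest))
    return sorted(key_frames), hists


if __name__ == "__main__":
    # Synthetic stand-in data: 12 frames, 20 four-dimensional descriptors each.
    rng = np.random.default_rng(1)
    frames = [rng.normal(size=(20, 4)) for _ in range(12)]
    keys, hists = extract_key_frames(frames)
    print("key-frame indices:", keys)
```

In a real setting the synthetic arrays would be replaced by descriptors computed on each image modality (luminance, hue, edges, optical flow magnitude), and the vocabulary size and number of clusters would need tuning for the dataset at hand.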
The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement number 316564 (IMPART).
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Mademlis, I., Tefas, A., Nikolaidis, N., Pitas, I. (2017). Compact Video Description and Representation for Automated Summarization of Human Activities. In: Angelov, P., Manolopoulos, Y., Iliadis, L., Roy, A., Vellasco, M. (eds) Advances in Big Data. INNS 2016. Advances in Intelligent Systems and Computing, vol 529. Springer, Cham. https://doi.org/10.1007/978-3-319-47898-2_3
DOI: https://doi.org/10.1007/978-3-319-47898-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47897-5
Online ISBN: 978-3-319-47898-2
eBook Packages: Engineering, Engineering (R0)