Sequence Kernels for Clustering and Visualizing Near Duplicate Video Segments

Bailer, Werner

doi:10.1007/978-3-642-27355-1_36

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7131)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Organizing and visualizing video collections that contain a large number of near duplicates is an important problem in film and video post-production. While kernels for matching sequences of feature vectors have been used, for example, for classification of video segments, kernel-based methods have not yet been applied to matching near duplicate video segments. In this paper we survey the application of six sequence-based kernels to clustering near duplicate video segments using kernel k-means and hierarchical clustering, and the application of kernel PCA for generating content visualizations for browsing. Evaluation on the TRECVID 2007 BBC rushes data set shows that the results of the kernel-based methods are comparable to other approaches for matching near duplicates, eliminating differences between dynamic time warping and string matching. These results show that hierarchical clustering outperforms kernel k-means. We also show that well-arranged visualizations of both single- and multi-view content sets can be obtained using kernel PCA.
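
To make the clustering step mentioned in the abstract more concrete, the following is a minimal sketch of kernel k-means operating on a precomputed kernel (Gram) matrix. It is not the paper's implementation: the function name `kernel_kmeans`, the toy matrix `K`, and the starting assignment are all assumptions made for illustration. In the setting of the paper, each entry of `K` would presumably come from a sequence kernel evaluated on the feature vector sequences of two video segments; here the values are simply made up so that items 0 to 2 and items 3 to 4 form two groups.

```python
def kernel_kmeans(K, labels, max_iter=100):
    """Kernel k-means on a precomputed kernel (Gram) matrix.

    K      : list of lists, K[i][j] is the kernel value between items i and j
    labels : initial cluster index per item (hypothetical starting assignment)
    Returns the final list of cluster indices.
    """
    n = len(K)
    k = max(labels) + 1
    labels = list(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Within-cluster term (1/|c|^2) * sum_{j,l in c} K[j][l]; it is the
        # same for every item, so compute it once per cluster.
        within = [
            sum(K[j][l] for j in m for l in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster has no mean to compare against
                # Squared feature-space distance from item i to the mean of c.
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return labels


# Toy kernel matrix (made-up values): items 0-2 are mutually similar,
# items 3-4 are mutually similar, and the two groups are dissimilar.
K = [
    [1.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0],
]

result = kernel_kmeans(K, [0, 1, 0, 1, 0])
print(result)  # → [0, 0, 0, 1, 1]
```

The sketch only ever reads kernel values, never the underlying feature vectors, which is what allows the same procedure to be used with any positive semi-definite sequence kernel. The outcome depends on the initial assignment, so a practical version would likely try several starting points.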

Author information

Authors and Affiliations

DIGITAL – Institute for Information and Communication Technologies, Joanneum Research Forschungsgesellschaft mbH, Graz, Austria
Werner Bailer

Editor information

Editors and Affiliations

Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Universitätsstr. 65-67, 9020, Klagenfurt, Austria
Klaus Schoeffmann
EURECOM, 2229 Route des Crêtes, BP 193, 06904, Sophia Antipolis Cedex, France
Bernard Merialdo
School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, 15213-3890, Pittsburgh, PA, USA
Alexander G. Hauptmann
Department of Computer Science, City University of Hong Kong, Tat Chee Ave, Kowloon, Hong Kong
Chong-Wah Ngo
Department of Electronic and Electrical Engineering, University College London, Roberts Building, Torrington Place, WC1E 7JE, London, UK
Yiannis Andreopoulos
Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstrasse 9-11 188/2, 1040, Vienna, Austria
Christian Breiteneder

About this paper

Cite this paper

Bailer, W. (2012). Sequence Kernels for Clustering and Visualizing Near Duplicate Video Segments. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, CW., Andreopoulos, Y., Breiteneder, C. (eds) Advances in Multimedia Modeling. MMM 2012. Lecture Notes in Computer Science, vol 7131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27355-1_36

DOI: https://doi.org/10.1007/978-3-642-27355-1_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27354-4
Online ISBN: 978-3-642-27355-1
eBook Packages: Computer Science, Computer Science (R0)
