Abstract
This paper presents a new approach to video clip retrieval using the Earth Mover’s Distance (EMD). The approach builds on many-to-many matching between two graph-based representations. Measuring the similarity between two clips is formulated as a graph matching task in two stages. First, a bipartite graph with spatio-temporal neighbourhoods is constructed to explore the relations between data points and to estimate the relevance between a pair of video clips. Second, using the EMD, the problem of matching a clip pair is converted to computing the minimum cost of transportation within the spatio-temporal graph. Experimental results on the UCF YouTube Action dataset show a significant improvement in retrieval performance over conventional techniques.
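To make the second stage concrete, the following is a minimal sketch of the EMD as a minimum-cost transportation problem on a bipartite graph. It is not the authors' implementation: the function name `emd`, the arguments `w_p`, `w_q` and `cost`, and the toy data are hypothetical, the weights are assumed to be non-negative integers, and the solver (successive shortest paths with Bellman-Ford) is chosen only because it needs nothing beyond the standard library. How the spatio-temporal graph and its ground distances are built is outside the scope of this sketch.

```python
from math import inf


def emd(w_p, w_q, cost):
    """EMD between two weighted signatures (illustrative sketch).

    w_p[i], w_q[j]: non-negative integer weights of the two node sets.
    cost[i][j]: ground distance between node i of P and node j of Q.
    Returns the minimum transport cost divided by the total flow moved.
    """
    m, n = len(w_p), len(w_q)
    s, t = m + n, m + n + 1          # source and sink of the flow network
    num_nodes = m + n + 2
    graph = [[] for _ in range(num_nodes)]
    edges = []                        # [head, residual capacity, unit cost]

    def add_edge(u, v, cap, c):
        # Forward edge at an even index; its reverse edge is index ^ 1.
        graph[u].append(len(edges))
        edges.append([v, cap, c])
        graph[v].append(len(edges))
        edges.append([u, 0, -c])

    for i in range(m):
        add_edge(s, i, w_p[i], 0.0)
        for j in range(n):
            add_edge(i, m + j, inf, cost[i][j])
    for j in range(n):
        add_edge(m + j, t, w_q[j], 0.0)

    target = min(sum(w_p), sum(w_q))  # partial matching moves the smaller mass
    total_flow, total_cost = 0, 0.0
    while total_flow < target:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [inf] * num_nodes
        prev_edge = [-1] * num_nodes
        dist[s] = 0.0
        for _ in range(num_nodes - 1):
            changed = False
            for u in range(num_nodes):
                if dist[u] == inf:
                    continue
                for k in graph[u]:
                    v, cap, c = edges[k]
                    if cap > 0 and dist[u] + c < dist[v] - 1e-12:
                        dist[v] = dist[u] + c
                        prev_edge[v] = k
                        changed = True
            if not changed:
                break
        if dist[t] == inf:
            break
        # Find the bottleneck along the path, then push that much flow.
        push = target - total_flow
        v = t
        while v != s:
            k = prev_edge[v]
            push = min(push, edges[k][1])
            v = edges[k ^ 1][0]
        v = t
        while v != s:
            k = prev_edge[v]
            edges[k][1] -= push
            edges[k ^ 1][1] += push
            v = edges[k ^ 1][0]
        total_flow += push
        total_cost += push * dist[t]
    return total_cost / total_flow if total_flow else 0.0


# Toy example: two clips, each summarised by two weighted nodes.
print(emd([2, 1], [1, 2], [[0.0, 1.0], [1.0, 0.0]]))
```

A lower value means less mass has to be moved to turn one signature into the other, so clips can be ranked by ascending EMD. For larger graphs a dedicated linear-programming or network-simplex solver would be a more practical choice than this didactic solver.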
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ghamdi, M.A., Gotoh, Y. (2014). Video Clip Retrieval by Graph Matching. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_34
DOI: https://doi.org/10.1007/978-3-319-06028-6_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06027-9
Online ISBN: 978-3-319-06028-6
eBook Packages: Computer Science (R0)