Abstract
Numerous web videos with rich metadata are available on the Internet today. Metadata such as video tags creates opportunities for video search and multimedia content understanding, but it also poses a challenge: tags are usually annotated at the video level, while many of them describe only parts of the video content. Localizing the parts or frames of a web video that are relevant to a given tag is therefore key to many applications and research tasks. In this paper we propose combining a topic model with relevance filtering to localize relevant frames. Our method has three steps. First, we apply relevance filtering to assign relevance scores to video frames and obtain a raw relevant frame set by selecting the top-ranked frames. Second, we separate the frames into topics by mining their underlying semantics with Latent Dirichlet Allocation, and use the raw relevant set as a validation set to select relevant topics. Finally, the topical relevances are used to refine the raw relevant frame set and produce the final results. Experimental results on real web videos validate the effectiveness of the proposed approach.
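The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how topics are scored against the raw set or how topical relevance is combined with the frame scores, so the overlap ratio in `topic_relevance`, the blending weight `alpha`, and all function names here are assumptions made for illustration. The per-frame relevance scores and the per-frame topic distributions `theta` (as any LDA implementation would produce) are taken as given inputs.

```python
from typing import List, Sequence, Set


def top_ranked(scores: Sequence[float], k: int) -> Set[int]:
    """Step 1: indices of the k highest-scoring frames (the raw relevant set)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def topic_relevance(theta: Sequence[Sequence[float]], raw: Set[int]) -> List[float]:
    """Step 2 (assumed rule): share of each topic's mass that lies on raw-set frames."""
    n_topics = len(theta[0])
    relevance = []
    for t in range(n_topics):
        total = sum(row[t] for row in theta)
        inside = sum(theta[i][t] for i in raw)
        relevance.append(inside / total if total > 0 else 0.0)
    return relevance


def refine(scores: Sequence[float], theta: Sequence[Sequence[float]],
           k: int, alpha: float = 0.5) -> Set[int]:
    """Step 3 (assumed rule): blend frame scores with topical relevance, re-select top k."""
    raw = top_ranked(scores, k)
    rel = topic_relevance(theta, raw)
    # Topical relevance of a frame: its topic distribution weighted by topic relevance.
    topical = [sum(p * r for p, r in zip(row, rel)) for row in theta]
    combined = [alpha * s + (1 - alpha) * t for s, t in zip(scores, topical)]
    return top_ranked(combined, k)


# Toy data: six frames, two topics. Frame 2 scores well but belongs to the
# topic that is weakly supported by the raw set; frame 3 is the opposite case.
scores = [0.9, 0.8, 0.75, 0.7, 0.2, 0.1]
theta = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9],
         [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]

raw_set = top_ranked(scores, 3)
final_set = refine(scores, theta, 3)
print(sorted(raw_set), sorted(final_set))
```

On this toy input the raw set contains frame 2, and the refinement replaces it with frame 3, whose dominant topic is better supported by the other top-ranked frames.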
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yi, L., Li, H., Neo, SY. (2013). Combining Topic Model and Relevance Filtering to Localize Relevant Frames in Web Videos. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_20
DOI: https://doi.org/10.1007/978-3-642-35728-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science (R0)