Combining Topic Model and Relevance Filtering to Localize Relevant Frames in Web Videos

Yi, Lei; Li, Haojie; Neo, Shi-Yong

doi:10.1007/978-3-642-35728-2_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7733)

Abstract

Numerous web videos with rich metadata are available on the Internet today. Metadata such as video tags create opportunities for video search and multimedia content understanding, but they also pose a challenge: tags are usually annotated at the video level, while many tags describe only part of the video content. Localizing the parts or frames of a web video that are relevant to a given tag is therefore key to many applications and research tasks. In this paper we propose combining a topic model with relevance filtering to localize relevant frames. Our method has three steps. First, we apply relevance filtering to assign relevance scores to video frames and obtain a raw relevant frame set by selecting the top-ranked frames. Second, we separate the frames into topics by mining their underlying semantics with Latent Dirichlet Allocation, and use the raw relevant frame set as a validation set to select relevant topics. Finally, the topical relevances are used to refine the raw relevant frame set and produce the final results. Experimental results on real web videos validate the effectiveness of the proposed approach.
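
The three steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the frame features, the relevance filter, or the exact selection and refinement rules. The sketch assumes that each frame is already quantized into a bag of visual-word ids and that relevance scores are given as input. The function names (top_k, lda_gibbs, localize), the "dominant topic" assignment, the topic relevance measure (share of a topic's frames that fall in the raw set), and the 0.5 threshold are all hypothetical choices made for illustration. The topic model is a minimal collapsed Gibbs sampler for Latent Dirichlet Allocation written with the Python standard library only.

```python
import random
from collections import Counter


def top_k(scores, k):
    """Step 1 (assumed form): indices of the k highest-scoring frames."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def lda_gibbs(docs, n_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.

    docs is a list of frames, each a list of visual-word ids.
    Returns, for every frame, the number of its words assigned to each topic.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every word occurrence.
    for d, words in enumerate(docs):
        zs = []
        for w in words:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, words in enumerate(docs):
            for n, w in enumerate(words):
                # Remove the current assignment from the counts.
                z = assign[d][n]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic


def localize(frames, scores, n_topics, vocab_size, k, min_topic_relevance=0.5):
    """Hypothetical three-step pipeline: filter, model topics, refine."""
    raw = top_k(scores, k)                                  # step 1
    doc_topic = lda_gibbs(frames, n_topics, vocab_size)     # step 2
    dominant = [max(range(n_topics), key=lambda t: c[t]) for c in doc_topic]
    size = Counter(dominant)
    hits = Counter(dominant[i] for i in raw)
    # Assumed topic relevance: share of the topic's frames found in the raw set.
    topic_rel = {t: hits[t] / size[t] for t in size}
    # Step 3 (assumed rule): keep every frame whose topic is relevant enough.
    final = {i for i in range(len(frames))
             if topic_rel[dominant[i]] >= min_topic_relevance}
    return raw, topic_rel, final


# Toy data: frames 0-2 use visual words 0-2, frames 3-5 use visual words 3-5.
toy_frames = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [0, 0, 1, 2],
              [3, 4, 5, 3], [4, 5, 3, 5, 4], [5, 3, 4, 4]]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
raw_set, topic_relevance, final_set = localize(
    toy_frames, toy_scores, n_topics=2, vocab_size=6, k=3)
print(sorted(raw_set), topic_relevance, sorted(final_set))
```

With this thresholding rule, a frame outside the raw set can still enter the final set when it shares a topic with many top-ranked frames, and a top-ranked frame can be dropped when its topic is mostly low-scoring. This is one plausible reading of "refine"; the paper itself may combine the scores differently.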

Author information

Authors and Affiliations

School of Software, Dalian University of Technology, China
Lei Yi & Haojie Li
KAI Square Pte Ltd., Singapore
Shi-Yong Neo

Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Street, 100080, Beijing, China
Shipeng Li & Tao Mei
School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward, K1N 6N5, Ottawa, ON, Canada
Abdulmotaleb El Saddik
School of Computer and Information, Hefei University of Technology, Road Tunxi 193#, 230009, Hefei, Anhui, China
Meng Wang & Richang Hong
Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 14, 38100, Trento, Italy
Nicu Sebe
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117583, Singapore, Singapore
Shuicheng Yan
School of Computing, CLARITY: Centre for Sensor Web Technologies, Dublin City University, Glasnevin, 9, Dublin, Ireland
Cathal Gurrin

About this paper

Cite this paper

Yi, L., Li, H., Neo, SY. (2013). Combining Topic Model and Relevance Filtering to Localize Relevant Frames in Web Videos. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_20

DOI: https://doi.org/10.1007/978-3-642-35728-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science, Computer Science (R0)
