Mental Visual Browsing

He, Jun; Shang, Xindi; Zhang, Hanwang; Chua, Tat-Seng

doi:10.1007/978-3-319-27674-8_44

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9517)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

We present a surprisingly easy-to-use video browser that helps users pinpoint a specific video shot they have in mind within a long video. At each interactive iteration, the only user effort required is to click the one shot, out of 8 displayed shots, that is visually closest to the user's mental target. The system then updates the browsing model and displays another 8 shots for the next iteration. The proposed system is underpinned by a theoretically sound Bayesian framework that maintains the probabilities of all the video shots segmented from the long video. This framework guarantees that we can find the target shot in a video of around 1 hour within 3–5 iterations. We believe that our system will perform well in the Video Browser Showdown game of MMM 2016.
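
The abstract does not give the model's equations, so the following is only a minimal sketch of the kind of loop it describes: keep a probability for every shot, show 8 shots, and apply Bayes' rule after each click. Everything specific here is an assumption made for illustration and not the authors' method: the function names `update_posterior` and `choose_display`, the softmax-over-distance user model, the `temperature` parameter, and the top-k display rule.

```python
import numpy as np


def update_posterior(prior, features, displayed, clicked, temperature=0.1):
    """One Bayesian update after the user clicks `clicked` among `displayed`.

    prior:     (n_shots,) probabilities that each shot is the mental target
    features:  (n_shots, dim) visual descriptor per shot (hypothetical input)
    displayed: list of shot indices shown in this iteration
    clicked:   the index from `displayed` that the user selected
    """
    displayed = list(displayed)
    # Distance from every candidate target to every displayed shot: (n_shots, k).
    diffs = features[:, None, :] - features[displayed][None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    # Assumed user model: for a given target, the user clicks displayed shot j
    # with probability proportional to exp(-distance / temperature).
    logits = -dists / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    click_probs = np.exp(logits)
    click_probs /= click_probs.sum(axis=1, keepdims=True)

    # Bayes' rule: posterior is proportional to prior times the likelihood
    # of the observed click under each candidate target.
    likelihood = click_probs[:, displayed.index(clicked)]
    posterior = prior * likelihood
    return posterior / posterior.sum()


def choose_display(posterior, k=8):
    """Assumed display rule: show the k currently most probable shots."""
    return [int(i) for i in np.argsort(-posterior)[:k]]
```

A session would start from a uniform prior, call `choose_display`, wait for the click, call `update_posterior`, and repeat. As a rough sanity check on the 3–5 iteration figure: an ideal, noise-free click among 8 options can separate at most 8 groups of candidates, so k clicks can distinguish at most 8^k shots (512 for k = 3 and 32,768 for k = 5).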

Author information

Authors and Affiliations

School of Computing, National University of Singapore, Singapore, Singapore
Jun He, Xindi Shang, Hanwang Zhang & Tat-Seng Chua

Corresponding author

Correspondence to Xindi Shang.

Editor information

Editors and Affiliations

University of Texas at San Antonio, San Antonio, USA
Qi Tian
Dept. of Information Engineering, University of Trento, Povo, Trento, Italy
Nicu Sebe
EECS, University of Central Florida, Orlando, Florida, USA
Guo-Jun Qi
EURECOM, Sophia-Antipolis, France
Benoit Huet
Hefei University of Technology, Hefei, Anhui, China
Richang Hong
School of Computing and Information, Hefei University of Technology, Hefei, Anhui, China
Xueliang Liu

About this paper

Cite this paper

He, J., Shang, X., Zhang, H., Chua, TS. (2016). Mental Visual Browsing. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9517. Springer, Cham. https://doi.org/10.1007/978-3-319-27674-8_44

DOI: https://doi.org/10.1007/978-3-319-27674-8_44
Published: 01 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27673-1
Online ISBN: 978-3-319-27674-8
eBook Packages: Computer Science, Computer Science (R0)