SOM-Hunter: Video Browsing with Relevance-to-SOM Feedback Loop

Kratochvíl, Miroslav; Veselý, Patrik; Mejzlík, František; Lokoč, Jakub

doi:10.1007/978-3-030-37734-2_71

Miroslav Kratochvíl¹⁶,
Patrik Veselý¹⁶,
František Mejzlík¹⁶ &
…
Jakub Lokoč¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11962))

Included in the following conference series:

International Conference on Multimedia Modeling

2595 Accesses
40 Citations

Abstract

This paper presents a prototype video retrieval engine focusing on a simple known-item search workflow, where users initialize the search with a query and then use an iterative approach to explore a larger candidate set. Specifically, users gradually observe a sequence of displays and provide feedback to the system. The displays are dynamically created by a self organizing map that employs the scores based on the collected feedback, in order to provide a display matching the user preferences. In addition, users can inspect various other types of specialized displays for exploitation purposes, once promising candidates are found.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 89.00; Price excludes VAT (USA)

Softcover Book: USD 119.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
A hypernym represents a set of supported basic labels.

References

Barthel, K.U., Hezel, N., Jung, K.: Fusing keyword search and visual exploration for untagged videos. In: Schoeffmann, K., et al. (eds.) MMM 2018. LNCS, vol. 10705, pp. 413–418. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73600-6_43
Chapter Google Scholar
Cox, I.J., Miller, M.L., Minka, T.P., Papathomas, T.V., Yianilos, P.N.: The Bayesian image retrieval system, pichunter: theory, implementation, and psychophysical experiments. IEEE Trans. Image Process. 9(1), 20–37 (2000)
Article Google Scholar
Donahue, J., et al.: DeCAF: a deep convolutional activation feature for generic visual recognition. In: Proceedings of the 31st International Conference on Machine Learning, ICML 2014, vol. 32, pp. 647–655. JMLR.org (2014)
Google Scholar
He, J., Shang, X., Zhang, H., Chua, T.S.: Mental visual browsing. In: Tian, Q., Sebe, N., Qi, G.J., Huet, B., Hong, R., Liu, X. (eds.) MMM 2016. LNCS, vol. 9517, pp. 424–428. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-27674-8_44
Chapter Google Scholar
Kohonen, T.: The self-organizing map. Neurocomputing 21(1–3), 1–6 (1998)
Article Google Scholar
Lokoč, J., Bailer, W., Schoeffmann, K., Münzer, B., Awad, G.: On influential trends in interactive video retrieval: video browser showdown 2015–2017. IEEE Trans. Multimedia 20(12), 3361–3376 (2018). https://doi.org/10.1109/TMM.2018.2830110
Article Google Scholar
Lokoč, J., et al.: Interactive search or sequential browsing? A detailed analysis of the video browser showdown 2018. ACM Trans. Multimedia Comput. Commun. Appl. 15(1), 29:1–29:18 (2019). https://doi.org/10.1145/3295663
Article Google Scholar
Lokoč, J., Kovalčík, G., Souček, T., Moravec, J., Čech, P.: A framework for effective known-item search in video. In: Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, pp. 1777–1785. ACM, New York (2019). https://doi.org/10.1145/3343031.3351046
Rossetto, L., Amiri Parian, M., Gasser, R., Giangreco, I., Heller, S., Schuldt, H.: Deep learning-based concept detection in vitrivr. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 616–621. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_55
Chapter Google Scholar
Schoeffmann, K., Münzer, B., Leibetseder, A., Primus, J., Kletz, S.: Autopiloting feature maps: the deep interactive video exploration (diveXplore) system at VBS2019. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, W.-H., Vrochidis, S. (eds.) MMM 2019. LNCS, vol. 11296, pp. 585–590. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05716-9_50
Chapter Google Scholar
Szegedy, C., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, 7–12 June 2015, Boston, MA, USA, pp. 1–9 (2015)
Google Scholar
Vinyals, O., Toshev, A., Bengio, S., Erhan, D.: Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 652–663 (2017)
Article Google Scholar
Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. CoRR abs/1707.07012 (2017). http://arxiv.org/abs/1707.07012
Li, X., Xu, C., Yang, G., Chen, Z., Dong, J.: W2VV++: fully deep learning for ad-hoc video search. In: Proceedings of the 27th ACM International Conference on Multimedia, MM 2019, 21–25 October 2019, Nice, France, pp. 1786–1794 (2019). https://doi.org/10.1145/3343031.3350906

Download references

Acknowledgments

This paper has been supported by Czech Science Foundation (GAČR) project 19-22071Y and by Charles University grant SVV-260451. M.K. was supported by ELIXIR CZ (MEYS), grant number LM2015047.

We are extremely grateful to Vladimír Vondruš for his helpful advices on using the Magnum engine, and to Tomáš Souček and Gregor Kovalčík for their help with frame selection and feature extraction.

Author information

Authors and Affiliations

SIRET Research Group, Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
Miroslav Kratochvíl, Patrik Veselý, František Mejzlík & Jakub Lokoč

Authors

Miroslav Kratochvíl
View author publications
You can also search for this author in PubMed Google Scholar
Patrik Veselý
View author publications
You can also search for this author in PubMed Google Scholar
František Mejzlík
View author publications
You can also search for this author in PubMed Google Scholar
Jakub Lokoč
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Miroslav Kratochvíl or Jakub Lokoč .

Editor information

Editors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Yong Man Ro
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Junmo Kim
National Cheng Kung University, Tainan City, Taiwan
Wei-Ta Chu
Tsinghua University, Beijing, China
Peng Cui
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Jung-Woo Choi
National Tsing Hua University, Hsinchu, Taiwan
Min-Chun Hu
Ghent University, Ghent, Belgium
Wesley De Neve

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Kratochvíl, M., Veselý, P., Mejzlík, F., Lokoč, J. (2020). SOM-Hunter: Video Browsing with Relevance-to-SOM Feedback Loop. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science(), vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_71

Download citation

DOI: https://doi.org/10.1007/978-3-030-37734-2_71
Published: 24 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37733-5
Online ISBN: 978-3-030-37734-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics