What Is the Role of Similarity for Known-Item Search at Video Browser Showdown?
Across many domains, machine learning approaches have started to compete with human experts in tasks originally considered very difficult to automate. However, effective retrieval of general video shots remains an open issue due to their variability, their complexity, and the insufficiency of available training sets. In addition, users may struggle to formulate their search intents in a given query interface. Hence, many systems still also rely on interactive human-machine cooperation to boost the effectiveness of the retrieval process. In this paper, we present our experience with known-item search tasks in the Video Browser Showdown competition, where the participating interactive video retrieval systems mostly rely on various similarity models. We discuss the observed difficulty of known-item search tasks, categorize the employed interaction components (relying on similarity models), and inspect successful interactive known-item searches from the recent iteration of the competition. Finally, open similarity search challenges for known-item search in video are presented.
Keywords: Interactive video retrieval · Known-item search · Similarity search
This paper has been supported by Czech Science Foundation (GAČR) project no. 17-22224S and by grant SVV-260451. This work is also supported by Universität Klagenfurt and Lakeside Labs GmbH, Klagenfurt, Austria, and by funding from the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF) under grant KWF 20214 u. 3520/26336/38165.