VIRET at Video Browser Showdown 2020

  • Jakub Lokoč
  • Gregor Kovalčík
  • Tomáš Souček
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11962)

Abstract

During the last three years, the most successful systems at the Video Browser Showdown have employed effective retrieval models in which raw video data are automatically preprocessed in advance to extract semantic or low-level features of selected frames or shots. This enables users to express their search intents in the form of keywords, sketches, query examples, or combinations of these. In this paper, we present new extensions to our interactive video retrieval system VIRET, which won the Video Browser Showdown in 2018 and took second place at both the Video Browser Showdown 2019 and the Lifelog Search Challenge 2019. The new features focus on updates to the retrieval models and on interface modifications that help users specify queries by means of informative visualizations.

Notes

Acknowledgments

This paper has been supported by the Czech Science Foundation (GAČR) project 19-22071Y and by Charles University grant SVV-260451. We would also like to thank Přemysl Čech and Vít Škrhák for their help with the WPF interface.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Jakub Lokoč (1, corresponding author)
  • Gregor Kovalčík (1)
  • Tomáš Souček (1)

  1. SIRET Research Group, Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
