VIRET Tool Meets NasNet

  • Jakub Lokoč
  • Gregor Kovalčík
  • Tomáš Souček
  • Jaroslav Moravec
  • Jan Bodnár
  • Přemysl Čech
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11296)

Abstract

The results of the last Video Browser Showdown, held in Bangkok in 2018, show that multimodal search with interactive query reformulation is a competitive search strategy across all of the evaluated task categories. We therefore aim to improve the effectiveness of the underlying retrieval models in the new version of our interactive video retrieval tool VIRET by making use of the most recent deep network architectures. Specifically, we apply the NasNet deep convolutional neural network architecture for automatic annotation and similarity search over the set of selected frames from the provided video collection. In addition, we implement temporal sequence queries and subimage similarity search to give users more flexibility in query formulation.
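
As an illustration of the annotation and similarity search components described in the abstract, the following Python sketch annotates selected video frames and ranks them against a query frame with a NasNet network. It is a minimal sketch only: it assumes the publicly available Keras NASNetLarge ImageNet weights and plain cosine-similarity ranking, not the exact model, frame selection, or index structures used in VIRET.

import numpy as np
from tensorflow.keras.applications.nasnet import NASNetLarge, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

# Illustrative stand-ins, not the VIRET pipeline: a full classifier for keyword
# annotation and a pooled-feature network for similarity search.
annotator = NASNetLarge(weights="imagenet")                                   # expects 331x331 RGB input
extractor = NASNetLarge(weights="imagenet", include_top=False, pooling="avg")

def load_frame(path):
    # Load one selected video frame and preprocess it for NasNet.
    img = image.load_img(path, target_size=(331, 331))
    return preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

def annotate(path, top=5):
    # Top ImageNet labels serve as automatic annotations of the frame.
    preds = annotator.predict(load_frame(path), verbose=0)
    return [(label, float(score)) for _, label, score in decode_predictions(preds, top=top)[0]]

def descriptor(path):
    # L2-normalised pooled deep feature used as the frame descriptor.
    feat = extractor.predict(load_frame(path), verbose=0)[0]
    return feat / np.linalg.norm(feat)

def nearest_frames(query_path, frame_paths, k=10):
    # Rank collection frames by cosine similarity to the query frame.
    db = np.stack([descriptor(p) for p in frame_paths])
    scores = db @ descriptor(query_path)
    top_idx = np.argsort(-scores)[:k]
    return [(frame_paths[i], float(scores[i])) for i in top_idx]

In a real deployment the descriptors would be precomputed once for the whole frame collection and queried through an index; temporal sequence and subimage queries can then be expressed as aggregations of such per-frame scores over neighbouring frames or image regions.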

Keywords

Known-item search · Deep learning · NasNet · Interactive video retrieval

Acknowledgments

This paper has been supported in part by the Czech Science Foundation (GAČR) project No. 17-22224S and by Charles University grant SVV-260451.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jakub Lokoč¹
  • Gregor Kovalčík¹
  • Tomáš Souček¹
  • Jaroslav Moravec¹
  • Jan Bodnár¹
  • Přemysl Čech¹

  1. SIRET Research Group, Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
