An Interactive Video Search Platform for Multi-modal Retrieval with Advanced Concepts

  • Nguyen-Khang Le
  • Dieu-Hien Nguyen
  • Minh-Triet Tran
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11962)

Abstract

The previous version of our retrieval system produced notable results in retrieval tasks such as lifelog moment retrieval. In this paper, we adapt the platform to the Known-Item Search (KIS) and Ad-hoc Video Search (AVS) tasks of the Video Browser Showdown and present how the system performs in video search. In addition to the existing smart features, which take advantage of the provided analysis data, we enrich the data with object color detection based on Mask R-CNN and clustering. In this version, we also extract the locations of the entities appearing in the videos, with the aim of exploiting the spatial relationships between them. Finally, we focus on efficient user interaction and high-performance data transfer within the system to minimize retrieval time.
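The abstract does not give implementation details, so the following is only a minimal sketch of how object color detection and a coarse spatial relation could be computed. It assumes that an instance segmentation model such as Mask R-CNN has already produced a boolean mask and a bounding box per object. The function names (`masked_pixels`, `dominant_color`, `spatial_relation`), the use of plain k-means in RGB space, and the center-based relation rule are illustrative assumptions, not the authors' method.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]             # (R, G, B)
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def masked_pixels(image: Sequence[Sequence[Pixel]],
                  mask: Sequence[Sequence[bool]]) -> List[Pixel]:
    """Collect the pixels covered by an object's (assumed) instance mask."""
    return [px for img_row, mask_row in zip(image, mask)
            for px, keep in zip(img_row, mask_row) if keep]


def _nearest(p: Pixel, centroids: list) -> int:
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))


def dominant_color(pixels: List[Pixel], k: int = 2, iters: int = 10) -> Pixel:
    """Hypothetical helper: k-means over RGB values, returning the
    rounded centroid of the largest cluster."""
    if not pixels:
        raise ValueError("mask selects no pixels")
    unique = sorted(set(pixels))
    k = max(1, min(k, len(unique)))
    # Deterministic initialisation: evenly spaced distinct colors.
    centroids = [tuple(map(float, unique[i * len(unique) // k])) for i in range(k)]
    clusters: list = []
    for _ in range(max(1, iters)):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            clusters[_nearest(p, centroids)].append(p)
        # Recompute each centroid as the channel-wise mean of its cluster.
        centroids = [tuple(sum(ch) / len(c) for ch in zip(*c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    largest = max(range(k), key=lambda i: len(clusters[i]))
    return tuple(round(v) for v in centroids[largest])


def spatial_relation(a: Box, b: Box) -> str:
    """Coarse position of box a relative to box b, based on box centers.
    Image coordinates are assumed, so y grows downward."""
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    if dx == 0 and dy == 0:
        return "overlapping"
    if abs(dx) >= abs(dy):
        return "left of" if dx < 0 else "right of"
    return "above" if dy < 0 else "below"
```

For example, a mask covering three reddish pixels and one blue pixel yields a reddish dominant color, and a box whose center lies to the left of another box's center is reported as "left of". A production system would more likely cluster in a perceptual color space and map centroids to named colors; that step is omitted here.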

Keywords

Retrieval system · User interaction · Concept detection

Notes

Acknowledgement

Research is supported by Vingroup Innovation Foundation (VINIF) in project code VINIF.2019.DA19. We would like to thank AIOZ Pte Ltd for supporting our research team with computing infrastructure.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Nguyen-Khang Le (1)
  • Dieu-Hien Nguyen (1)
  • Minh-Triet Tran (1)
  1. University of Science, VNU-HCM, Ho Chi Minh City, Vietnam