An Interactive Video Search Tool: A Case Study Using the V3C1 Dataset

Alfarrarjeh, Abdullah; Yoon, Jungwon; Kim, Seon Ho; Abu Jabal, Amani; Nagaraj, Akarsh; Siddaramaiah, Chinmayee

doi:10.1007/978-3-030-67835-7_43

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12573)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This paper presents a prototype of an interactive video search tool developed in preparation for the Video Browser Showdown (VBS) at MMM 2021. The tool is tailored to searching the public V3C1 dataset using various analysis results extracted from it, including detected objects, speech recognition output, and visual features. It supports two types of search: text-based and visual-based. In a text-based search, users query videos by their textual descriptions; in a visual-based search, users provide an example video to find similar videos. For accurate search, the tool uses metadata extracted by recent state-of-the-art computer vision algorithms for object detection and visual feature extraction. For efficient search, the metadata are managed in two database engines: Whoosh and PostgreSQL. Users can also refine the search results by providing relevance feedback and by customizing the intermediate analysis of the query inputs.
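
The two search modes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain in-memory Python as a stand-in for the Whoosh and PostgreSQL engines, and the names `Keyframe`, `text_search`, and `visual_search`, the toy corpus, and the three-dimensional feature vectors are all hypothetical. Text search here requires every query term to appear in a keyframe's text and ranks by term frequency; visual search ranks keyframes by Euclidean distance to an example feature vector.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Keyframe:
    video_id: str
    text: str                    # hypothetical: object labels and speech transcript merged
    feature: tuple[float, ...]   # hypothetical visual feature vector


def text_search(keyframes, query):
    """Return keyframes containing every query term, most frequent matches first."""
    terms = query.lower().split()
    hits = []
    for kf in keyframes:
        words = kf.text.lower().split()
        if all(term in words for term in terms):
            score = sum(words.count(term) for term in terms)
            hits.append((score, kf))
    # Stable sort: keyframes with equal scores keep their corpus order.
    hits.sort(key=lambda pair: -pair[0])
    return [kf for _, kf in hits]


def visual_search(keyframes, example_feature, k=3):
    """Return the k keyframes whose features are closest to the example."""
    ranked = sorted(keyframes, key=lambda kf: dist(kf.feature, example_feature))
    return ranked[:k]


# Toy corpus, invented purely for illustration.
corpus = [
    Keyframe("v1", "dog running on beach", (0.9, 0.1, 0.0)),
    Keyframe("v2", "man speaking about dog training dog", (0.2, 0.8, 0.1)),
    Keyframe("v3", "city street at night", (0.1, 0.2, 0.9)),
]

if __name__ == "__main__":
    print([kf.video_id for kf in text_search(corpus, "dog")])
    print([kf.video_id for kf in visual_search(corpus, (0.8, 0.2, 0.0), k=2)])
```

A real deployment would delegate both functions to indexed engines rather than scanning the corpus linearly; the sketch only shows the shape of the two query types.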

Notes

1. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/.
2. When considering only nouns, the size of the vocabulary of DenseCap is 849 objects.
3. https://developer.apple.com/documentation/coreimage/cifacefeature.
4. https://pypi.org/project/langdetect/.
5. https://www.nltk.org/book/ch05.html.
6. https://www.postgresql.org/docs/10/cube.html.
7. https://www.postgresql.org/docs/11/textsearch-indexes.html.
8. If a keyframe does not match the detected objects keyword but matches the other query keywords, it is omitted from the retrieved keyframes.

Acknowledgment

This research has been supported in part by the USC Integrated Media Systems Center and unrestricted cash gifts from Oracle. The authors also acknowledge the USC Center for Advanced Research Computing (CARC) for providing computing resources for some of the experiments, and thank Dr. Aiichiro Nakano for his help in using CARC.

Author information

Authors and Affiliations

Integrated Media Systems Center, University of Southern California, Los Angeles, USA
Jungwon Yoon, Seon Ho Kim, Akarsh Nagaraj & Chinmayee Siddaramaiah
Department of Computer Science, German Jordanian University, Amman, Jordan
Abdullah Alfarrarjeh & Amani Abu Jabal

Corresponding author

Correspondence to Abdullah Alfarrarjeh.

Editor information

Editors and Affiliations

Charles University, Prague, Czech Republic
Jakub Lokoč
Charles University, Prague, Czech Republic
Tomáš Skopal
Klagenfurt University, Klagenfurt, Austria
Klaus Schoeffmann
CERTH-ITI, Thessaloniki, Greece
Vasileios Mezaris
Renmin University of China, Beijing, China
Xirong Li
CERTH-ITI, Thessaloniki, Greece
Stefanos Vrochidis
Queen Mary University of London, London, UK
Ioannis Patras

About this paper

Cite this paper

Alfarrarjeh, A., Yoon, J., Kim, S.H., Abu Jabal, A., Nagaraj, A., Siddaramaiah, C. (2021). An Interactive Video Search Tool: A Case Study Using the V3C1 Dataset. In: Lokoč, J., et al. MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol 12573. Springer, Cham. https://doi.org/10.1007/978-3-030-67835-7_43

DOI: https://doi.org/10.1007/978-3-030-67835-7_43
Published: 21 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67834-0
Online ISBN: 978-3-030-67835-7
eBook Packages: Computer Science, Computer Science (R0)
