Abstract
This paper presents a prototype of an interactive video search tool developed in preparation for the Video Browser Showdown (VBS) at MMM 2021. The tool is tailored to searching the public V3C1 dataset together with its associated analysis results, including detected objects, speech recognition, and visual features. It supports two types of search: text-based and visual-based. With text-based search, users query videos by their textual descriptions; with visual-based search, users provide an example video to find similar videos. Metadata extracted by recent state-of-the-art computer vision algorithms for object detection and visual feature extraction are used for accurate search. For efficient search, the metadata are managed in two database engines: Whoosh and PostgreSQL. The tool also enables users to refine the search results by providing relevance feedback and by customizing the intermediate analysis of the query inputs.
Notes
- 2. When considering only nouns, the size of the vocabulary of DenseCap is 849 objects.
- 8. If a keyframe does not match the detected objects keyword but matches the other query keywords, it is omitted from the retrieved keyframes.
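The filtering rule in the last note can be illustrated with a minimal sketch. It is not the tool's implementation: the `Keyframe` record, its field names, and the `filter_keyframes` function are hypothetical, and ranking the surviving keyframes by how many of the other query keywords they match is an assumption added only to make the example complete.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    # Hypothetical record; the field names are illustrative, not from the paper.
    keyframe_id: str
    detected_objects: set  # labels produced by an object detector
    other_terms: set       # e.g. terms from speech recognition


def filter_keyframes(keyframes, object_keyword, other_keywords):
    """Keep only keyframes that match the detected-object keyword.

    A keyframe that matches the other query keywords but not the
    detected-object keyword is omitted, as the note describes.
    """
    other_keywords = set(other_keywords)
    matching = [kf for kf in keyframes if object_keyword in kf.detected_objects]
    # Assumption: order the kept keyframes by the number of other keywords matched.
    return sorted(
        matching,
        key=lambda kf: len(kf.other_terms & other_keywords),
        reverse=True,
    )


frames = [
    Keyframe("a", {"dog"}, {"park"}),
    Keyframe("b", {"cat"}, {"park"}),  # matches "park" only, so it is omitted
    Keyframe("c", {"dog"}, set()),
]
result = filter_keyframes(frames, "dog", {"park"})
print([kf.keyframe_id for kf in result])
```

Treating the detected-object keyword as a hard filter keeps the result set consistent with the detector output, while the remaining keywords only influence ordering in this sketch.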
Acknowledgment
This research has been supported in part by the USC Integrated Media Systems Center and unrestricted cash gifts from Oracle. The authors also acknowledge the USC Center for Advanced Research Computing (CARC) for providing computing resources for some of the experiments, and thank Dr. Aiichiro Nakano for his help in using CARC.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Alfarrarjeh, A., Yoon, J., Kim, S.H., Abu Jabal, A., Nagaraj, A., Siddaramaiah, C. (2021). An Interactive Video Search Tool: A Case Study Using the V3C1 Dataset. In: Lokoč, J., et al. MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol 12573. Springer, Cham. https://doi.org/10.1007/978-3-030-67835-7_43
DOI: https://doi.org/10.1007/978-3-030-67835-7_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67834-0
Online ISBN: 978-3-030-67835-7
eBook Packages: Computer Science, Computer Science (R0)