Content based video retrieval system using two stream convolutional neural network

Sowmyayani, S.; Rani, P. Arockia Jansi

doi:10.1007/s11042-023-14784-5

Published: 27 February 2023

Volume 82, pages 24465–24483, (2023)

Multimedia Tools and Applications

S. Sowmyayani¹ &
P. Arockia Jansi Rani²

Abstract

Capturing video with mobile phones and digital cameras and uploading it to social media is now commonplace. These videos do not have semantic tags, which makes them difficult for web users to search. Content Based Video Retrieval (CBVR) helps identify the most relevant videos for a given query video. The objective of this paper is to retrieve the most relevant videos for a given query video in reduced time. To meet this objective, the paper proposes an efficient video retrieval system that uses salient object detection and keyframe extraction to reduce the high dimensionality of video data. Spatio-temporal features are extracted using a two-stream Convolutional Neural Network (CNN) and stored in a feature dataset. The salient objects are used to search for the exact subject given in the query. Relevant videos are identified by similarity matching between the feature of the query video and the feature dataset created from the input dataset. To reduce the complexity of similarity matching, the proposed method replaces the feature dataset with a classification score dataset. Experiments are conducted on the TRECVID and CC_Web_Video datasets and evaluated using precision, recall, specificity, accuracy and F-score. Compared with recent methods, the proposed method achieves a precision of approximately 99.68% on the TRECVID dataset and 88.9% on the CC_Web_Video dataset. The proposed method outperforms most recent methods by 0.001 in mean Average Precision (mAP) on the CC_Web_Video dataset and by 4% in precision on the TRECVID dataset. The computation time is reduced by 100 min on TRECVID and 200 min on CC_Web_Video.
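The matching and evaluation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so cosine similarity is an assumption here. The function names (`rank_videos`, `retrieval_metrics`), the three-class score vectors and the confusion counts are hypothetical toy values, not data from the paper. Each video is represented by a short classification score vector rather than a high-dimensional feature vector, which is the substitution the abstract describes for reducing matching cost.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_videos(query_scores: Sequence[float],
                score_dataset: Dict[str, Sequence[float]],
                top_k: int = 3) -> List[str]:
    """Return the ids of the top_k videos most similar to the query.

    Assumption: similarity is cosine similarity between classification
    score vectors; the paper may use a different measure.
    """
    ranked = sorted(
        score_dataset,
        key=lambda vid: cosine_similarity(query_scores, score_dataset[vid]),
        reverse=True,
    )
    return ranked[:top_k]


def retrieval_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard precision, recall, specificity, accuracy and F-score."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# Hypothetical classification score dataset: one score vector per stored video.
score_dataset = {
    "video_a": [0.90, 0.05, 0.05],
    "video_b": [0.10, 0.80, 0.10],
    "video_c": [0.70, 0.20, 0.10],
}
query_scores = [0.85, 0.10, 0.05]  # hypothetical scores for the query video

top = rank_videos(query_scores, score_dataset, top_k=2)
metrics = retrieval_metrics(tp=8, fp=2, tn=85, fn=5)  # toy confusion counts
print(top)
print(metrics)
```

Because a score vector has one entry per class, each comparison costs time proportional to the number of classes rather than to the feature dimension, which is one plausible reading of why the substitution reduces computation time.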

Author information

Authors and Affiliations

Department of Computer Science (SF), St. Mary’s College (Autonomous), Thoothukudi, Tamilnadu, India
S. Sowmyayani
Department of Computer Science and Engineering, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu, India
P. Arockia Jansi Rani

Corresponding author

Correspondence to S. Sowmyayani.

Ethics declarations

Conflict of interest

This work, entitled “Content Based Video Retrieval System Using Two Stream Convolutional Neural Network”, has not been submitted anywhere else. All content used in this research is original and not copied. The authors declare no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sowmyayani, S., Rani, P.A.J. Content based video retrieval system using two stream convolutional neural network. Multimed Tools Appl 82, 24465–24483 (2023). https://doi.org/10.1007/s11042-023-14784-5

Received: 04 October 2021
Revised: 11 April 2022
Accepted: 06 February 2023
Published: 27 February 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s11042-023-14784-5
