Abstract
Due to the rapid growth of digital video content, there is an urgent need for efficient automatic video content analysis mechanisms for tasks such as summarization, retrieval, and classification. All of these applications require shot boundary detection. This paper proposes a novel dual-stage approach for cut transition detection that can withstand certain illumination and motion effects. First, we present a deep neural network model that combines a pre-trained model with a long short-term memory (LSTM) network and the Euclidean distance metric. Two parallel pre-trained models sharing the same weights extract the spatial features. These features are then fed to the LSTM and the Euclidean distance metric to classify frame pairs as similar or dissimilar. To train the model, we generated a new database from online videos containing 5000 labeled frame pairs (similar, dissimilar) for training and 1000 frame pairs for testing. Second, we adopt a segment selection process to identify candidate segments that may contain shot boundaries. This preprocessing step can help improve the accuracy and speed of the video shot boundary detection (VSBD) algorithm. Cut transition detection based on the similarity model is then conducted to identify the shot boundaries in the candidate segments. Experimental results on the standard databases TRECVid 2001, TRECVid 2007, and RAI show that the proposed approach achieves better detection rates than state-of-the-art SBD methods in terms of the F1 score.
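The second stage described above (candidate segment selection followed by cut detection inside the candidates) can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frame feature vectors stand in for the embeddings a trained CNN-LSTM model would produce, and the function names, the fixed segment length, and the single distance threshold are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def candidate_segments(
    features: Sequence[Sequence[float]], seg_len: int, threshold: float
) -> List[Tuple[int, int]]:
    """Keep only segments whose first and last frames differ strongly.

    Segments share their end frames, so a cut that falls between two
    segments is still covered. (Hypothetical selection rule.)
    """
    last = len(features) - 1
    candidates = []
    for start in range(0, last, seg_len):
        end = min(start + seg_len, last)
        if euclidean(features[start], features[end]) > threshold:
            candidates.append((start, end))
    return candidates


def detect_cuts(
    features: Sequence[Sequence[float]], seg_len: int = 4, threshold: float = 1.0
) -> List[int]:
    """Return indices of frames that start a new shot.

    Only consecutive frame pairs inside candidate segments are compared,
    which is where the speed-up from segment selection comes from.
    """
    cuts = []
    for start, end in candidate_segments(features, seg_len, threshold):
        for i in range(start, end):
            if euclidean(features[i], features[i + 1]) > threshold:
                cuts.append(i + 1)
    return cuts


# Toy input: five frames of one "shot" followed by five frames of another.
frames = [[0.0, 0.0]] * 5 + [[5.0, 5.0]] * 5
print(detect_cuts(frames))  # [5]
```

In this toy run only the segment spanning frames 4 to 8 is selected as a candidate, so four consecutive-frame comparisons are made instead of nine. In the approach described in the abstract, the similar/dissimilar decision would come from the trained similarity model rather than from a hand-set threshold.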
Data Availability
Data are available from the authors upon request.
Funding
There is no funding.
Ethics declarations
Conflict of interest
Both authors have checked the manuscript and have agreed to its submission to the International Journal of Multimedia Information Retrieval. There is no conflict of interest between the authors.
Ethics approval
This paper contains no studies with human participants performed by any of the authors.
Additional information
Abdelhalim Benoughidene and Faiza Titouna have contributed equally to this work.
About this article
Cite this article
Benoughidene, A., Titouna, F. A novel method for video shot boundary detection using CNN-LSTM approach. Int J Multimed Info Retr 11, 653–667 (2022). https://doi.org/10.1007/s13735-022-00251-8
DOI: https://doi.org/10.1007/s13735-022-00251-8