A novel method for video shot boundary detection using CNN-LSTM approach

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Due to the rapid growth of digital video content, there is an urgent need for efficient automatic video content analysis mechanisms for tasks such as summarization, retrieval, and classification. All of these applications first require shot boundary detection. This paper proposes a novel dual-stage approach for cut transition detection that can withstand certain illumination and motion effects. First, we present a deep neural network model that combines a pre-trained model with a long short-term memory (LSTM) network and the Euclidean distance metric. Two parallel pre-trained models sharing the same weights extract the spatial features. These features are then fed to the LSTM and the Euclidean distance metric to classify each frame pair as similar or not similar. To train the model, we generated a new database from online videos containing 5000 frame pairs with two labels (similar, dissimilar) for training and 1000 frame pairs for testing. Second, we adopt a segment selection process to predict the shot boundaries. This preprocessing step can help improve the accuracy and speed of the video shot boundary detection (VSBD) algorithm. Cut transition detection based on the similarity model is then conducted to identify the shot boundaries in the candidate segments. Experimental results on the standard databases TRECVid 2001, TRECVid 2007, and RAI show that the proposed approach achieves better detection rates than state-of-the-art SBD methods in terms of the F1 score criterion.
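The dual-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional vectors stand in for the CNN-LSTM features, and the function names, the segment length (`seg_len`), and both thresholds are hypothetical choices made only for this example. The first stage keeps segments whose first and last frames are far apart, the second stage looks for a large Euclidean distance between consecutive frames inside those segments, and the F1 score is computed in its usual form from precision and recall.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def candidate_segments(features, seg_len, seg_thresh):
    """Stage 1 (assumed form): keep segments whose end frames are dissimilar."""
    segments = []
    last = len(features) - 1
    for start in range(0, last, seg_len):
        end = min(start + seg_len, last)
        if euclidean(features[start], features[end]) > seg_thresh:
            segments.append((start, end))
    return segments

def detect_cuts(features, segments, cut_thresh):
    """Stage 2: flag dissimilar consecutive frames inside candidate segments."""
    cuts = []
    for start, end in segments:
        for i in range(start, end):
            if euclidean(features[i], features[i + 1]) > cut_thresh:
                cuts.append(i + 1)  # first frame of the new shot
    return cuts

def f1_score(predicted, truth):
    """F1 = 2PR / (P + R), with exact-index matching of cut positions."""
    tp = len(set(predicted) & set(truth))
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)

# Toy data (hypothetical): 7 frames of one shot, then 5 frames of another,
# so the only cut is at frame index 7.
features = [[0.0, 0.1 * (i % 2)] for i in range(7)]
features += [[5.0, 0.1 * (i % 2)] for i in range(5)]

segments = candidate_segments(features, seg_len=4, seg_thresh=1.0)
cuts = detect_cuts(features, segments, cut_thresh=1.0)
print(segments, cuts, f1_score(cuts, [7]))
```

In this toy run only one of the three segments survives the first stage, so the second stage compares four frame pairs instead of eleven, which is the kind of speed-up the segment selection step is meant to provide.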

Data Availability

Data are available from the authors upon request.

Funding

There is no funding.

Author information

Corresponding author

Correspondence to Abdelhalim Benoughidene.

Ethics declarations

Conflict of interest

Both authors have checked the manuscript and have agreed to its submission to the International Journal of Multimedia Information Retrieval. There is no conflict of interest between the authors.

Ethics approval

This paper does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Abdelhalim Benoughidene and Faiza Titouna have contributed equally to this work.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Benoughidene, A., Titouna, F. A novel method for video shot boundary detection using CNN-LSTM approach. Int J Multimed Info Retr 11, 653–667 (2022). https://doi.org/10.1007/s13735-022-00251-8

  • DOI: https://doi.org/10.1007/s13735-022-00251-8
