Particle swarm optimized deep spatio-temporal features for efficient video retrieval

  • Original Research
International Journal of Information Technology

Abstract

In content-based video retrieval, video frame selection and three-dimensional feature extraction are especially crucial stages. Both should be optimized for time complexity, because any increase in their execution time also increases the retrieval time. Furthermore, fixed keyframe sampling methods such as clustering, uniform sampling, and shot-boundary-based sampling do not optimize the information content of the selection, which can leave redundant information in the selected keyframes. To address these problems, this work proposes a deep spatio-temporal feature extraction method based on particle swarm optimization for efficient video retrieval. The optimization is based on the grayscale histogram. The results demonstrate that the approach outperforms several benchmark sampling methods. In future iterations of this work, deep features could serve as the basis for the optimization.
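The idea of selecting keyframes by particle swarm optimization over grayscale histograms can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in the abstract. The fitness function (mean pairwise L1 distance between histograms), the rounding-based decoding of particle positions, and all names and parameter values (`pso_keyframes`, `k`, `w`, `c1`, `c2`, and so on) are assumptions made for illustration only.

```python
import numpy as np

def gray_histograms(frames, bins=32):
    """Return one L1-normalized grayscale histogram per frame."""
    hists = np.stack([np.histogram(f, bins=bins, range=(0, 256))[0]
                      for f in frames]).astype(float)
    return hists / hists.sum(axis=1, keepdims=True)

def diversity(indices, hists):
    """Assumed fitness: mean pairwise L1 distance between chosen histograms."""
    h = hists[indices]
    dists = np.abs(h[:, None, :] - h[None, :, :]).sum(axis=2)
    k = len(indices)
    return dists.sum() / (k * (k - 1))

def decode(position, n_frames):
    """Map a continuous particle position to sorted integer frame indices."""
    return np.sort(np.clip(np.rint(position), 0, n_frames - 1).astype(int))

def pso_keyframes(frames, k=4, n_particles=20, n_iters=50,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """Pick k keyframes (k >= 2) with a basic global-best PSO (illustrative)."""
    rng = np.random.default_rng(seed)
    hists = gray_histograms(frames)
    n = len(frames)
    # Each particle is a vector of k continuous frame positions.
    pos = rng.uniform(0, n - 1, size=(n_particles, k))
    vel = np.zeros_like(pos)
    fit = np.array([diversity(decode(p, n), hists) for p in pos])
    pbest, pbest_fit = pos.copy(), fit.copy()
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0, n - 1)
        fit = np.array([diversity(decode(p, n), hists) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return decode(gbest, n), gbest_fit
```

Here `frames` is assumed to be an array of grayscale frames with shape `(T, H, W)` and values in 0 to 255; a call such as `pso_keyframes(frames, k=4)` returns the selected frame indices and their fitness. Because duplicate or near-identical frames contribute a pairwise distance close to zero, this fitness penalizes redundant selections, which is the shortcoming the abstract attributes to fixed sampling.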

Data availability

The datasets analysed during the current study are available at https://www.crcv.ucf.edu/data/UCF50.rar.

Author information

Corresponding author

Correspondence to Alina Banerjee.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Human and animal rights

The authors did not receive support from any organization for the submitted work. This article does not contain any studies involving human participants performed by any of the authors. This article also does not contain any studies involving animals performed by any of the authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Banerjee, A., Kumar, E. & Ravinder, M. Particle swarm optimized deep spatio-temporal features for efficient video retrieval. Int. j. inf. tecnol. 16, 1763–1768 (2024). https://doi.org/10.1007/s41870-024-01733-0

  • DOI: https://doi.org/10.1007/s41870-024-01733-0
