Keyframes and Shot Boundaries: The Attributes of Scene Segmentation and Classification

  • Conference paper
Harmony Search and Nature Inspired Optimization Algorithms

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 741)

Abstract

Video analytics for real-life scenarios deals with multimedia data statistics that may be characterized by multimodal features of the video components. The large variety of low-scale multimodal features of objects creates many challenges for discrimination and analysis. In addition, occlusion, varied illumination, and complex environmental conditions make video parsing a challenging research problem. For experimental purposes, the vital components of a video include scenes, shots, keyframes, objects, and background. In this work, we focus on keyframes and shot boundaries for scene segmentation of sample videos taken from YouTube. The structural similarity index (SSIM) of the shots is computed from the histograms of LBP and HSV color similarities. Motion similarity and inverse time proximity are added to generate a Shot Similarity Graph. Sliding window methods are used to group similar shots. The proposed scene segmentation method is validated on six videos with varied semantics featuring human beings and animals. The playing time of the videos ranges from 0.5 to 15 min, and the total number of scenes per video ranges from 6 to 33.
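The abstract does not give the exact formulas, so the following is only a minimal sketch of how a shot similarity graph and sliding-window grouping of this kind might be assembled. It is not the authors' implementation. Histogram intersection stands in for the SSIM-based color and texture measure, and the weights, the time constant `tau`, the window size, and the threshold are all hypothetical values chosen for illustration. Each shot is assumed to be summarized by one HSV color histogram, one LBP texture histogram, a motion score in [0, 1], and a timestamp in seconds.

```python
import numpy as np


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two histograms after L1 normalisation."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    h1 = h1 / h1.sum()
    h2 = h2 / h2.sum()
    return float(np.minimum(h1, h2).sum())


def shot_similarity_graph(color_hists, texture_hists, motion, times,
                          weights=(0.4, 0.3, 0.2, 0.1), tau=30.0):
    """Build a symmetric shot-by-shot similarity matrix.

    weights and tau are illustrative assumptions, not values from the paper.
    """
    n = len(times)
    w_color, w_texture, w_motion, w_time = weights
    graph = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            color = histogram_intersection(color_hists[i], color_hists[j])
            texture = histogram_intersection(texture_hists[i], texture_hists[j])
            # Motion scores are assumed to lie in [0, 1].
            motion_sim = 1.0 - abs(motion[i] - motion[j])
            # Inverse time proximity: shots far apart in time score lower.
            proximity = 1.0 / (1.0 + abs(times[i] - times[j]) / tau)
            score = (w_color * color + w_texture * texture
                     + w_motion * motion_sim + w_time * proximity)
            graph[i, j] = graph[j, i] = score
    return graph


def group_shots(graph, window=3, threshold=0.6):
    """Label shots with scene indices using a backward sliding window.

    A shot joins the current scene if it is similar enough to any of the
    previous `window` shots; otherwise it starts a new scene.
    """
    labels = [0]
    for i in range(1, graph.shape[0]):
        start = max(0, i - window)
        if graph[i, start:i].max() >= threshold:
            labels.append(labels[-1])
        else:
            labels.append(labels[-1] + 1)
    return labels


# Toy input: two visually similar shots followed by two different ones.
color = [[8, 1, 1], [7, 2, 1], [1, 1, 8], [1, 2, 7]]
texture = [[8, 1, 1], [7, 2, 1], [1, 1, 8], [1, 2, 7]]
motion = [0.1, 0.1, 0.9, 0.9]
times = [0.0, 5.0, 10.0, 15.0]

graph = shot_similarity_graph(color, texture, motion, times)
labels = group_shots(graph)
print(labels)
```

On this toy input the first two shots fall into one scene and the last two into another. In practice the histograms would come from keyframes of each detected shot, and the threshold would need tuning per dataset.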



Corresponding author

Correspondence to N. Kumar.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Kumar, N., Sukavanam, N. (2019). Keyframes and Shot Boundaries: The Attributes of Scene Segmentation and Classification. In: Yadav, N., Yadav, A., Bansal, J., Deep, K., Kim, J. (eds) Harmony Search and Nature Inspired Optimization Algorithms. Advances in Intelligent Systems and Computing, vol 741. Springer, Singapore. https://doi.org/10.1007/978-981-13-0761-4_74
