Keyframes and Shot Boundaries: The Attributes of Scene Segmentation and Classification

Kumar, N.; Sukavanam, N.

doi:10.1007/978-981-13-0761-4_74

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 741)

Abstract

Video analytics for real-life scenarios deals with multimedia data statistics that may be characterized by multimodal features of the video components. The large variety of low-scale multimodal features of objects creates many challenges for discrimination and analysis. In addition, occlusion, varied illumination, and complex environmental conditions make video parsing a challenging research problem. For experimental purposes, the vital components of a video include scenes, shots, keyframes, objects, and background. In this work, we focus on keyframes and shot boundaries for scene segmentation of sample videos taken from YouTube. The Structural Similarity Index (SSIM) of the shots is computed from the histograms of LBP and HSV color similarities. Motion similarity and inverse time proximity are added to generate a Shot Similarity Graph. Sliding window methods are used to group similar shots. The proposed scene segmentation method is validated on six videos of various semantics featuring human beings and animals. The video durations range from 0.5 to 15 min, and the total number of scenes per video ranges from 6 to 33.
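
The pipeline summarized in the abstract (appearance similarity from LBP and HSV histograms, plus motion similarity and inverse time proximity, followed by sliding-window grouping) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-shot LBP and HSV histograms, a motion score in [0, 1], and a timestamp are already computed. Histogram intersection stands in for the paper's SSIM-based measure, and the weights, window size, threshold, and all function names are hypothetical.

```python
import numpy as np

def hist_intersection(h1, h2):
    """Histogram intersection in [0, 1]; inputs are normalized to sum to 1.
    Assumption: used here in place of the SSIM-based measure in the paper."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.minimum(h1 / h1.sum(), h2 / h2.sum()).sum())

def shot_similarity_graph(lbp_hists, hsv_hists, motion, times,
                          weights=(0.4, 0.4, 0.1, 0.1)):
    """Dense shot-to-shot similarity matrix.
    The weights are illustrative, not values taken from the paper."""
    n = len(times)
    w_lbp, w_hsv, w_mot, w_time = weights
    graph = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s_lbp = hist_intersection(lbp_hists[i], lbp_hists[j])
            s_hsv = hist_intersection(hsv_hists[i], hsv_hists[j])
            # Motion scores are assumed to lie in [0, 1].
            s_mot = 1.0 - abs(motion[i] - motion[j])
            # Inverse time proximity: shots close in time score higher.
            s_time = 1.0 / (1.0 + abs(times[i] - times[j]))
            graph[i, j] = (w_lbp * s_lbp + w_hsv * s_hsv
                           + w_mot * s_mot + w_time * s_time)
    return graph

def group_shots(graph, window=2, threshold=0.6):
    """Sliding-window grouping: a shot stays in the current scene if it is
    similar enough to any of the previous `window` shots of that scene."""
    labels = [0]
    scene_start = 0
    for i in range(1, graph.shape[0]):
        lo = max(scene_start, i - window)
        if graph[i, lo:i].max() >= threshold:
            labels.append(labels[-1])
        else:
            labels.append(labels[-1] + 1)
            scene_start = i
    return labels

# Toy data: three visually similar shots followed by two different ones.
lbp = [[8, 1, 1, 0]] * 3 + [[0, 1, 1, 8]] * 2
hsv = [[8, 1, 1, 0]] * 3 + [[0, 1, 1, 8]] * 2
motion = [0.1, 0.1, 0.2, 0.8, 0.9]
times = [0, 1, 2, 3, 4]

graph = shot_similarity_graph(lbp, hsv, motion, times)
scene_labels = group_shots(graph)
print(scene_labels)
```

With this toy input the first three shots fall into one scene and the last two into another. In practice the threshold and window size would need tuning per video.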

Author information

Authors and Affiliations

Department of Mathematics, Indian Institute of Technology, Roorkee, 247667, India
N. Kumar & N. Sukavanam

Corresponding author

Correspondence to N. Kumar.

Editor information

Editors and Affiliations

School of Engineering and Technology, BML Munjal University, Gurgaon, Haryana, India
Neha Yadav
Department of Sciences and Humanities, National Institute of Technology, Srinagar, Uttarakhand, India
Anupam Yadav
Department of Mathematics, South Asian University, New Delhi, India
Jagdish Chand Bansal
Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Kusum Deep
School of Civil, Environmental and Architectural Engineering, Korea University, Seoul, Korea (Republic of)
Joong Hoon Kim

About this paper

Cite this paper

Kumar, N., Sukavanam, N. (2019). Keyframes and Shot Boundaries: The Attributes of Scene Segmentation and Classification. In: Yadav, N., Yadav, A., Bansal, J., Deep, K., Kim, J. (eds) Harmony Search and Nature Inspired Optimization Algorithms. Advances in Intelligent Systems and Computing, vol 741. Springer, Singapore. https://doi.org/10.1007/978-981-13-0761-4_74

DOI: https://doi.org/10.1007/978-981-13-0761-4_74
Published: 24 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0760-7
Online ISBN: 978-981-13-0761-4
eBook Packages: Intelligent Technologies and Robotics (R0)
