Learning to Segment a Video to Clips Based on Scene and Camera Motion

Kowdle, Adarsh; Chen, Tsuhan

doi:10.1007/978-3-642-33712-3_20


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7574)

Included in the following conference series:

European Conference on Computer Vision


Abstract

In this paper, we present a novel learning-based algorithm for temporal segmentation of a video into clips based on both camera and scene motion, specifically on the four combinations of static vs. dynamic camera and static vs. dynamic scene. Given a video, we first perform shot boundary detection to segment the video into shots. We enforce temporal continuity by constructing a Markov Random Field (MRF) over the frames of each shot, with edges between consecutive frames, and cast the segmentation problem as a frame-level discrete labeling problem. Using manually labeled data, we learn classifiers that exploit cues from optical flow to provide evidence for the different labels, and infer the best labeling over the frames. We show the effectiveness of the approach on user videos and full-length movies. Using sixty full-length movies spanning 50 years, we show that grouping frames purely on motion cues can aid computational applications such as recovering depth from a video, and can also reveal interesting trends in movies, with novel applications in video analysis (time-stamping archive movies) and film studies.
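The frame-level labeling step can be illustrated with a small sketch. The abstract does not give the form of the energy or the inference method, so everything below is an assumption made for illustration: per-frame costs stand in for the classifier evidence, a Potts penalty (`switch_penalty`, a hypothetical parameter) stands in for the temporal-continuity term, and Viterbi dynamic programming is swapped in as the solver because it is exact on a chain of consecutive frames. It is not presented as the authors' implementation.

```python
# Illustrative sketch only: minimum-energy labeling of a chain MRF over the
# frames of one shot, solved exactly with Viterbi dynamic programming.
# The costs, the Potts penalty, and the function name are assumptions.

LABELS = [
    "static camera / static scene",
    "static camera / dynamic scene",
    "dynamic camera / static scene",
    "dynamic camera / dynamic scene",
]


def label_shot(unary, switch_penalty=1.0):
    """Return one label index per frame that minimizes the chain energy.

    unary[t][k] is the cost of assigning label k to frame t (for example,
    a negative log classifier score). The pairwise term charges
    switch_penalty whenever two consecutive frames take different labels.
    """
    if not unary:
        return []
    num_labels = len(unary[0])
    cost = list(unary[0])  # cost[k]: best energy of a path ending in label k
    backptr = []           # backptr[t - 1][k]: best label at frame t - 1
    for t in range(1, len(unary)):
        new_cost, choices = [], []
        for k in range(num_labels):
            # Pick the previous label that is cheapest to extend to label k.
            best_j = min(
                range(num_labels),
                key=lambda j: cost[j] + (0.0 if j == k else switch_penalty),
            )
            pair = 0.0 if best_j == k else switch_penalty
            new_cost.append(cost[best_j] + pair + unary[t][k])
            choices.append(best_j)
        cost = new_cost
        backptr.append(choices)
    # Trace the optimal path back from the cheapest final label.
    k = min(range(num_labels), key=lambda j: cost[j])
    labels = [k]
    for choices in reversed(backptr):
        k = choices[k]
        labels.append(k)
    labels.reverse()
    return labels


if __name__ == "__main__":
    # Two labels for brevity; frame 2 has weak, noisy evidence for label 1.
    noisy = [[0.0, 1.0], [0.0, 1.0], [0.6, 0.4], [0.0, 1.0], [0.0, 1.0]]
    print(label_shot(noisy, switch_penalty=1.0))
    print(label_shot(noisy, switch_penalty=0.0))
```

With the smoothing term enabled, the isolated noisy frame is absorbed into the surrounding clip; with `switch_penalty=0.0` each frame is labeled independently and the noisy frame flips. This is the practical effect of the edges between consecutive frames: clip boundaries appear only where the evidence for a change outweighs the cost of switching labels.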



Author information

Authors and Affiliations

Cornell University, Ithaca, NY, USA
Adarsh Kowdle & Tsuhan Chen


Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid


About this paper

Cite this paper

Kowdle, A., Chen, T. (2012). Learning to Segment a Video to Clips Based on Scene and Camera Motion. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33712-3_20


DOI: https://doi.org/10.1007/978-3-642-33712-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33711-6
Online ISBN: 978-3-642-33712-3
eBook Packages: Computer Science, Computer Science (R0)
