Automatic Appropriate Segment Extraction from Shots Based on Learning from Example Videos

  • Yousuke Kurihara
  • Naoko Nitta
  • Noboru Babaguchi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5414)


Videos are composed of shots, each of which is recorded continuously by a camera, and video editing can be regarded as the process of re-sequencing shots selected from the original videos. Shots usually include redundant intervals, which professional editors often cut. Defining the interval of a shot that is used intact in the edited video as the appropriate segment, and all other intervals of equal length as inappropriate segments, this paper proposes a method for automatically extracting appropriate segments from shots. Because the characteristics that make an interval appropriate for the edited video differ among shots with different content, the proposed method first categorizes shots according to their content using Support Vector Machines. The appropriate segments are then extracted based on the temporal patterns of audio and visual features in appropriate and inappropriate segments, learned with Hidden Markov Models for each shot category. The effectiveness of the proposed method is verified experimentally.
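The extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the shot category has already been chosen (the paper uses Support Vector Machines for that step), that audio and visual features have been quantized into discrete symbols, and that each candidate window is scored by the log-likelihood ratio between an "appropriate" and an "inappropriate" Hidden Markov Model. The scoring rule, the function names `log_forward` and `extract_segment`, and all model parameters are hypothetical choices made for illustration.

```python
import math

def log_forward(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return loglik

def extract_segment(shot_obs, seg_len, hmm_app, hmm_inapp):
    """Slide a fixed-length window over the shot and return (start, score)
    of the window with the highest log-likelihood ratio (assumed criterion)."""
    best_start, best_score = None, float("-inf")
    for s in range(len(shot_obs) - seg_len + 1):
        window = shot_obs[s:s + seg_len]
        score = log_forward(window, *hmm_app) - log_forward(window, *hmm_inapp)
        if score > best_score:
            best_start, best_score = s, score
    return best_start, best_score

# Hypothetical toy models for one shot category: (start, transition, emission).
# Symbol 1 stands for "stable, informative frame", symbol 0 for anything else.
# All probabilities are strictly positive so every log-likelihood is finite.
TRANS = [[0.9, 0.1], [0.1, 0.9]]
HMM_APP = ([0.5, 0.5], TRANS, [[0.1, 0.9], [0.2, 0.8]])
HMM_INAPP = ([0.5, 0.5], TRANS, [[0.9, 0.1], [0.8, 0.2]])

# Hypothetical quantized feature sequence for one shot.
shot = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
start, score = extract_segment(shot, 4, HMM_APP, HMM_INAPP)
```

In this toy setting the selected window is the run of symbol 1 in the middle of the shot. With real data, each category would have its own pair of models trained on example videos, and continuous features would likely call for continuous-emission models rather than the discrete ones assumed here.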


Keywords: video editing, segment extraction, shot categorization, example videos



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Yousuke Kurihara (1)
  • Naoko Nitta (1)
  • Noboru Babaguchi (1)
  1. Graduate School of Engineering, Osaka University, Osaka, Japan
