Multimedia Tools and Applications, Volume 47, Issue 2, pp 325–346

Scene pathfinder: unsupervised clustering techniques for movie scenes extraction

  • Mehdi Ellouze
  • Nozha Boujemaa
  • Adel M. Alimi


Demand for movies keeps growing with the spread of the Internet and the increasing popularity of video-on-demand (VOD) services. The large volume of movies stored on the Internet or on VOD servers needs to be structured to speed up browsing. In this paper, we propose a new system called "The Scene Pathfinder" that segments movies into scenes, giving users non-sequential access so that they can watch particular scenes of a movie. This helps them judge a movie quickly and decide whether to buy or download it, avoiding wasted time and money. The proposed approach is multimodal: we use both visual and auditory information to perform the segmentation. We rely on the assumption that every movie scene is either an action scene or a non-action scene. Non-action scenes are generally characterized by static backgrounds and take place in a single location. For this reason, we rely on content information and on the Kohonen map to extract these scenes (shot agglomerations). Action scenes are characterized by high tempo and motion. For this reason, we rely on tempo features and on Fuzzy C-Means to classify shots and to localize the action zones. The two processes are complementary: the over-segmentation that may occur when action scenes are extracted from content information is repaired by the fuzzy clustering. Our system is tested on a varied database, and the results obtained show the merit of our approach and that our assumptions are well founded.
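As an illustration of the fuzzy clustering step mentioned above, the following is a minimal sketch of Fuzzy C-Means applied to one tempo value per shot. It is not the authors' implementation: the function name `fuzzy_c_means`, the one-dimensional feature, the fuzzifier `m = 2`, the 0.5 membership threshold, and the sample tempo values are all assumptions made for the example.

```python
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Cluster 1-D values with Fuzzy C-Means; returns (centers, memberships)."""
    rng = random.Random(seed)
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = [0.0] * n_clusters
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        new_centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            weighted_sum = sum(w * x for w, x in zip(weights, points))
            new_centers.append(weighted_sum / sum(weights))
        # Membership update: closer centres receive higher membership.
        for i, x in enumerate(points):
            dists = [abs(x - c) + 1e-12 for c in new_centers]
            for k in range(n_clusters):
                ratio_sum = sum((dists[k] / d) ** (2.0 / (m - 1.0)) for d in dists)
                u[i][k] = 1.0 / ratio_sum
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Hypothetical normalised tempo value per shot (illustrative data only).
tempo = [0.10, 0.15, 0.12, 0.85, 0.90, 0.80, 0.20, 0.88]
centers, memberships = fuzzy_c_means(tempo)

# Treat the cluster with the higher centre as the "action" cluster.
action_cluster = max(range(len(centers)), key=lambda k: centers[k])
is_action = [row[action_cluster] > 0.5 for row in memberships]
```

Runs of consecutive shots flagged in `is_action` would then be candidate action zones. Unlike a hard clustering, the membership degrees also indicate how confidently each shot belongs to the high-tempo group.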


Video scene detection; Video browsing; Movie-shots clustering; Video processing; Video segmentation



The authors would like to thank several individuals and groups for making the implementation of this system possible. The authors acknowledge the financial support of this work by grants from the General Direction of Scientific Research and Technological Renovation (DGRSRT), Tunisia, under the ARUB program 01/UR/11/02. We are also grateful to EGIDE and INRIA, France, for sponsoring this work and the three-month research placement of Mehdi Ellouze, from 1/11/2007 to 31/1/2008, in the INRIA IMEDIA Team, during which parts of this work were done.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. REGIM: Research Group on Intelligent Machines, University of Sfax, Sfax, Tunisia
  2. INRIA: IMEDIA Team, Le Chesnay Cedex, France
