Semantic Based Adaptive Movie Summarisation

  • Reede Ren
  • Hemant Misra
  • Joemon M. Jose
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5916)


This paper proposes a framework for automatic video summarization that exploits internal and external textual descriptions. The web knowledge base Wikipedia serves as an intermediate media layer that bridges the gap between general user descriptions and exact film subtitles. Latent Dirichlet Allocation (LDA) is used both to detect content topics and to match their distributions across Wikipedia items and movie subtitles. A saliency-based summarization system then selects perceptually attractive segments from each content topic to compose the summary. The evaluation collection consists of six English movies, and the summaries show high topic coverage relative to the official trailers from the Internet Movie Database.


Content-based video summarisation, latent Dirichlet allocation





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Reede Ren (1)
  • Hemant Misra (1)
  • Joemon M. Jose (1)
  1. Information Retrieval Group, University of Glasgow, Glasgow, UK
