MuLVAT: A Video Annotation Tool Based on XML-Dictionaries and Shot Clustering

  • Zenonas Theodosiou
  • Anastasis Kounoudes
  • Nicolas Tsapatsoulis
  • Marios Milis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5769)


Recent advances in digital video technology have resulted in an explosion of digital video data, available through the Web or in private repositories. Efficient searching in these repositories has created the need for semantic labeling of video data at various levels of granularity, i.e., movie, scene, shot, keyframe, video object, etc. Through multilevel labeling, video content is appropriately indexed, allowing access from various modalities and for a variety of applications. However, despite the huge efforts toward automatic video annotation, human intervention remains the only way to achieve reliable semantic video annotation. Manual video annotation is an extremely laborious process, and efficient tools developed for this purpose can, in many cases, make a real difference. In this paper we present a video annotation tool which uses structured knowledge, in the form of XML dictionaries, combined with a hierarchical classification scheme to attach semantic labels to video segments at various levels of granularity. Video segmentation is supported through the use of an efficient shot detection algorithm, while shots are combined into scenes through clustering with the aid of a Genetic Algorithm scheme. Finally, XML dictionary creation and editing tools are available during annotation, allowing the user to always use the semantic labels she/he wishes instead of the automatically created ones.
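As a rough illustration of how an XML dictionary can drive hierarchical labeling, the sketch below loads a small dictionary and attaches one of its labels to a shot. The abstract does not specify the tool's dictionary schema, so the `<dictionary>`/`<term>` layout, the `label` attribute, and the helper names `label_paths` and `annotate` are all assumptions made for this example and not the tool's actual format or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary layout (an assumption, not the tool's schema):
# nested <term> elements form the label hierarchy.
DICTIONARY_XML = """
<dictionary name="sports">
  <term label="football">
    <term label="goal"/>
    <term label="corner kick"/>
  </term>
  <term label="tennis">
    <term label="serve"/>
  </term>
</dictionary>
"""


def label_paths(xml_text):
    """Return every label in the dictionary as a root-to-term path."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        # Visit direct <term> children only, then recurse into each one.
        for child in node.findall("term"):
            path = prefix + [child.get("label")]
            paths.append(path)
            walk(child, path)

    walk(root, [])
    return paths


def annotate(level, start_frame, end_frame, path, allowed):
    """Build an annotation record; reject labels absent from the dictionary."""
    if path not in allowed:
        raise ValueError(f"label not in dictionary: {path}")
    return {
        "level": level,          # e.g. "movie", "scene", "shot", "keyframe"
        "start": start_frame,
        "end": end_frame,
        "label": "/".join(path),
    }


paths = label_paths(DICTIONARY_XML)
shot = annotate("shot", 120, 310, ["football", "goal"], paths)
print(shot["label"])
```

Restricting labels to paths present in the dictionary keeps annotations consistent across segments, and editing the dictionary, as the tool allows during annotation, would simply extend the set of allowed paths.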


Keywords: video annotation, hierarchical classification, XML dictionaries





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Zenonas Theodosiou (1)
  • Anastasis Kounoudes (2)
  • Nicolas Tsapatsoulis (1)
  • Marios Milis (2)
  1. Cyprus University of Technology, Limassol, Cyprus
  2. SignalGeneriX Ltd, Limassol, Cyprus
