Multimedia Ontology Based Computational Framework for Video Annotation and Retrieval

  • Alberto Del Bimbo
  • Marco Bertini
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4577)


Ontologies are defined as the representation of the semantics of terms and their relationships. Traditionally, they consist of concepts, concept properties, and relationships between concepts, all expressed in linguistic terms. To support video annotation and content-based retrieval effectively, traditional linguistic ontologies should be extended to include structural video information and perceptual elements such as visual data descriptors.

These extended ontologies (referred to in the following as multimedia ontologies) should support the definition of visual concepts as representatives of specific patterns of a linguistic concept. While the linguistic part of the ontology embeds permanent and objective items of the domain, the perceptual part includes visual concepts that depend on temporal experience and change with time and perception. For this reason, multimedia ontologies have to support the dynamic update of visual concepts, so that the temporal evolution of concepts can be represented.
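The relationship between the two parts of such an ontology can be illustrated with a minimal sketch. The abstract does not say how visual concepts are built or updated, so everything below is an assumption made for illustration: the class names `LinguisticConcept` and `VisualConcept`, the Euclidean distance, the `threshold` parameter, and the running-mean update are all hypothetical choices, not the authors' method.

```python
from dataclasses import dataclass, field
from typing import List


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two descriptors (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class VisualConcept:
    """One visual pattern of a linguistic concept, summarised by a prototype descriptor."""
    prototype: List[float]
    count: int = 1

    def update(self, descriptor: List[float]) -> None:
        # Running mean: the prototype drifts as new observations arrive,
        # which is one simple way to model temporal evolution.
        self.count += 1
        self.prototype = [
            p + (d - p) / self.count for p, d in zip(self.prototype, descriptor)
        ]


@dataclass
class LinguisticConcept:
    """The stable, linguistic part: a named concept with its visual concepts attached."""
    name: str
    visual_concepts: List[VisualConcept] = field(default_factory=list)

    def observe(self, descriptor: List[float], threshold: float = 1.0) -> VisualConcept:
        """Update the nearest visual concept, or create a new one if none is close enough."""
        best = min(
            self.visual_concepts,
            key=lambda vc: distance(vc.prototype, descriptor),
            default=None,
        )
        if best is not None and distance(best.prototype, descriptor) <= threshold:
            best.update(descriptor)
            return best
        new_concept = VisualConcept(prototype=list(descriptor))
        self.visual_concepts.append(new_concept)
        return new_concept


# Hypothetical usage: the concept name and descriptor values are made up.
concept = LinguisticConcept("example concept")
first = concept.observe([0.0, 0.0])
second = concept.observe([0.5, 0.0])   # close to the first pattern: updates it
third = concept.observe([5.0, 5.0])    # far away: becomes a new visual concept
```

In this sketch the linguistic concept name never changes, while its list of visual concepts grows and its prototypes shift as descriptors are observed, which mirrors the permanent versus evolving split described above.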


Keywords: Linguistic Term, Domain Ontology, Video Retrieval, Visual Concept, Perceptual Fact
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Alberto Del Bimbo (1)
  • Marco Bertini (1)
  1. Università di Firenze, Via S. Marta 3, 50139 Firenze, Italy
