A Survey of Semantic Image and Video Annotation Tools

  • Stamatia Dasiopoulou
  • Eirini Giannakidou
  • Georgios Litos
  • Polyxeni Malasioti
  • Yiannis Kompatsiaris
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6050)


The availability of semantically annotated image and video assets is a critical prerequisite for realising intelligent knowledge management services that address realistic user needs. Given the extent of the challenges involved in automatically extracting such descriptions, manually created metadata play a significant role, one further strengthened by their use in training and evaluating methods for the automatic extraction of content descriptions. The differing views taken by the two main approaches to semantic content description, namely the Semantic Web and MPEG-7, together with the traits particular to multimedia content owing to the multiple levels of information involved, have resulted in a variety of image and video annotation tools that adopt varying description aspects. Aiming to provide a common frame of reference and to highlight open issues, especially with respect to the coverage and interoperability of the produced metadata, this chapter presents an overview of the state of the art in image and video annotation tools.


Annotation Tool; Image Annotation; Semantic Annotation; Video Annotation; Descriptive Metadata
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Stamatia Dasiopoulou (1)
  • Eirini Giannakidou (1)
  • Georgios Litos (1)
  • Polyxeni Malasioti (1)
  • Yiannis Kompatsiaris (1)

  1. Multimedia Knowledge Laboratory, Informatics and Telematics Institute, Centre for Research and Technology Hellas, Greece
