Multimedia Systems, Volume 17, Issue 2, pp 135–148

A three-level architecture for bridging the image semantic gap

Original Research


Image retrieval systems must cope with the different ways in which image content can be apprehended, and in particular with the difficulty of characterizing visual semantics. To address this issue, we examine the use of three abstract levels of representation: Signal, Object and Semantic. At the Signal Level, we propose a framework that maps extracted low-level features to symbolic signal descriptors. The Object Level features a statistical model of the joint distribution of object concepts (such as mountains, sky…) and the symbolic signal descriptors. At the Semantic Level, signal and object characterizations are coupled within a logic-based framework. This framework is instantiated by a knowledge representation formalism that supports an expressive query language comprising several boolean and quantification operators. Our architecture therefore makes it possible to process topic-based queries. We evaluate our theoretical proposition experimentally on a corpus of real-world photographs and on the TRECVid corpus.
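To make the idea of coupling signal and object characterizations in a query more concrete, the following is a minimal sketch and not the formalism used in the article. It assumes an image is already indexed as a list of regions, each carrying one object concept and a set of symbolic signal descriptors. All names (`Region`, `is_a`, `has`, `at_least`, `most`, `q_and`, `q_not`) and the example labels are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Region:
    """One indexed image region (hypothetical structure, not from the article)."""
    concept: str                 # Object Level label, e.g. "sky"
    descriptors: FrozenSet[str]  # Signal Level symbols, e.g. {"blue"}


Image = List[Region]
RegionPred = Callable[[Region], bool]
Query = Callable[[Image], bool]


def is_a(concept: str) -> RegionPred:
    """Region predicate: the region is labelled with the given object concept."""
    return lambda r: r.concept == concept


def has(descriptor: str) -> RegionPred:
    """Region predicate: the region carries the given symbolic signal descriptor."""
    return lambda r: descriptor in r.descriptors


def both(p: RegionPred, q: RegionPred) -> RegionPred:
    """Conjunction of two region predicates."""
    return lambda r: p(r) and q(r)


def at_least(k: int, p: RegionPred) -> Query:
    """Quantifier: at least k regions of the image satisfy p."""
    return lambda img: sum(1 for r in img if p(r)) >= k


def most(p: RegionPred, q: RegionPred) -> Query:
    """Quantifier: more than half of the regions satisfying p also satisfy q."""
    def run(img: Image) -> bool:
        matching = [r for r in img if p(r)]
        return bool(matching) and 2 * sum(1 for r in matching if q(r)) > len(matching)
    return run


def q_and(a: Query, b: Query) -> Query:
    """Boolean AND over image-level queries."""
    return lambda img: a(img) and b(img)


def q_not(a: Query) -> Query:
    """Boolean NOT over an image-level query."""
    return lambda img: not a(img)


# Toy index entry: two sky regions and one mountain region (made-up data).
photo = [
    Region("sky", frozenset({"blue"})),
    Region("mountains", frozenset({"grey", "textured"})),
    Region("sky", frozenset({"white"})),
]

# "Some blue sky, and no building."
blue_sky_no_building = q_and(
    at_least(1, both(is_a("sky"), has("blue"))),
    q_not(at_least(1, is_a("building"))),
)

# "Most sky regions are blue."
mostly_blue_sky = most(is_a("sky"), has("blue"))
```

Here `blue_sky_no_building(photo)` holds, while `mostly_blue_sky(photo)` does not, because only one of the two sky regions is blue. Representing queries as composable predicates is just one convenient way to illustrate boolean and quantification operators; the article's logic-based framework may be structured quite differently.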


Multimedia processing · Semantic gap · Image indexing and retrieval · Experimental evaluation



Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. CNRS, University of Lyon, Lyon, France
