Multimedia Systems, Volume 13, Issue 5–6, pp 309–322

Semantic interactive image retrieval combining visual and conceptual content description

Regular Paper


We address the challenge of semantic gap reduction for image retrieval through an improved support vector machine (SVM)-based active relevance feedback framework, together with a hybrid visual and conceptual content representation and retrieval scheme. We introduce a new feature vector based on projecting the keywords associated with an image onto a set of “key concepts” with the help of an external lexical database. We then put forward two improvements to the SVM-based relevance feedback method. First, to optimize the transfer of information between the user and the system, we introduce a new active learning selection criterion that minimizes redundancy between the candidate images shown to the user. Second, as most image classes span a wide range of scales in the description space, we argue that insensitivity of the SVM to the scale of the data is desirable in this context, and we show how to obtain it by using specific kernel functions. Experimental evaluations show that the joint use of the new concept-based feature vector and the visual features with our relevance feedback scheme can significantly improve the quality of the results.
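The abstract does not spell out the selection criterion or name the kernels, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes a greedy rule (keep the candidates closest to the SVM decision boundary, then repeatedly add the one with the lowest kernel similarity to those already chosen) and uses the triangular kernel K(x, y) = -||x - y|| as one example of a kernel whose values simply rescale when the data are rescaled. The names `triangular_kernel`, `select_diverse_ambiguous`, `pool_size` and `batch_size` are hypothetical and do not come from the paper.

```python
import math


def triangular_kernel(x, y):
    """Triangular kernel K(x, y) = -||x - y|| (assumed here for illustration)."""
    return -math.dist(x, y)


def select_diverse_ambiguous(candidates, scores, kernel, pool_size, batch_size):
    """Greedy sketch of a low-redundancy active learning selection.

    candidates: list of feature vectors (tuples of floats)
    scores:     SVM decision values f(x), one per candidate
    Returns the indices of the images to show to the user.
    """
    # Step 1: keep the most ambiguous candidates, i.e. smallest |f(x)|.
    pool = sorted(range(len(candidates)), key=lambda i: abs(scores[i]))[:pool_size]
    if not pool or batch_size <= 0:
        return []

    # Step 2: start from the most ambiguous one, then add candidates one by one,
    # each time taking the one least similar to everything already selected.
    selected = [pool.pop(0)]
    while pool and len(selected) < batch_size:
        def redundancy(i):
            # Highest kernel value with respect to the current selection.
            return max(kernel(candidates[i], candidates[j]) for j in selected)

        best = min(pool, key=redundancy)
        pool.remove(best)
        selected.append(best)
    return selected


if __name__ == "__main__":
    images = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 0.1), (9.0, 9.0)]
    decision_values = [0.01, 0.02, 0.05, 0.03, 2.0]
    print(select_diverse_ambiguous(images, decision_values, triangular_kernel, 4, 2))

    # Rescaling the description space rescales the kernel values by the same
    # factor, so the ranking used by the selection does not change.
    scaled = [tuple(10.0 * v for v in img) for img in images]
    print(select_diverse_ambiguous(scaled, decision_values, triangular_kernel, 4, 2))
```

In this toy run the second image shown is the distant point rather than a near-duplicate of the first, which is the intended effect of penalizing redundancy. The decision values are taken as given here; in a real feedback loop they would come from the SVM trained on the user's labels at the current round.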


Cross-modal image retrieval · Relevance feedback · Active learning · Semantic indexing





Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Marin Ferecatu (1)
  • Nozha Boujemaa (1)
  • Michel Crucianu (2)

  1. INRIA Rocquencourt, IMEDIA Team, Le Chesnay Cedex, France
  2. CNAM Paris, Paris Cedex 03, France
