Semantically Relevant Image Retrieval by Combining Image and Linguistic Analysis

  • Tony Lam
  • Rahul Singh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4292)


In this paper, we introduce a novel approach to image-based information retrieval that combines image analysis with linguistic analysis of the associated annotations. While numerous Content-Based Image Retrieval (CBIR) systems exist, most of them are constrained to use images as the only source of information. In contrast, recent research, especially in the area of web search, has used techniques that rely purely on the textual information associated with an image. The proposed research adopts a conceptually different philosophy: it uses information at both the image and annotation levels if it detects strong semantic coherence between them. Otherwise, depending on the quality of the information available, one of the two media is selected to execute the search. Semantic similarity is defined through linguistic relationships in WordNet as well as through shape, texture, and color. Our investigations lead to results that are significant for the design of multimedia information retrieval systems. These include technical details on designing cross-media retrieval strategies, as well as the conclusion that combining information modalities during retrieval not only leads to more semantically relevant results but can also help capture highly complex issues such as the emergent semantics associated with images.
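The selection logic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `QueryEvidence` fields, the function names, the 0.5 threshold, and the convex-combination blend are all assumptions made for illustration, and every score is assumed to be pre-computed and normalized to [0, 1].

```python
from dataclasses import dataclass


@dataclass
class QueryEvidence:
    """Hypothetical per-query scores, each assumed to lie in [0, 1]."""
    coherence: float      # agreement between image content and annotation
    image_quality: float  # how informative the visual features appear to be
    text_quality: float   # how informative the annotation appears to be


def select_strategy(ev: QueryEvidence, coherence_threshold: float = 0.5) -> str:
    """Decide which media drive the search.

    The threshold default is an illustrative assumption, not a value
    taken from the paper.
    """
    if ev.coherence >= coherence_threshold:
        # Image and annotation agree strongly: use both.
        return "combined"
    # Weak coherence: fall back to whichever medium looks more reliable.
    return "image" if ev.image_quality >= ev.text_quality else "text"


def combined_similarity(image_sim: float, text_sim: float,
                        image_weight: float = 0.5) -> float:
    """Blend two similarities with a convex combination (an assumption)."""
    return image_weight * image_sim + (1.0 - image_weight) * text_sim
```

In this sketch, a query whose annotation and visual content agree would be routed to the combined search, while a query with a noisy annotation and clear visual features would fall back to image-only search.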


Keywords: Image Retrieval; Similarity Threshold; Image Annotation; Linguistic Analysis; Image Retrieval System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Tony Lam (1)
  • Rahul Singh (1)
  1. Department of Computer Science, San Francisco State University, San Francisco
