Adaptive Model for Integrating Different Types of Associated Texts for Automated Annotation of Web Images

  • Hongtao Xu
  • Xiangdong Zhou
  • Lan Lin
  • Mei Wang
  • Tat-Seng Chua
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5371)


Web images are accompanied by many associated texts on their Web pages, such as the image file name, ALT text, and surrounding text. It is well known that the semantics of Web images correlate strongly with these associated texts, so the texts can be used to infer the semantics of the images. However, different types of associated texts may play different roles in deriving the semantics of Web content. Most previous work either treats the associated texts as a whole or assigns fixed weights to the different types according to prior knowledge or heuristics. In this paper, we propose a novel linear basis expansion-based approach to automatically annotate Web images from their associated texts. In particular, we adaptively model the semantic contributions of the different types of associated texts using a piecewise penalty weighted regression model. We also demonstrate that social tagging data for Web images, such as Flickr's Related Tags, can be leveraged to enhance the performance of Web image annotation. Experiments conducted on a real Web image data set demonstrate that our approach can significantly improve the performance of Web image annotation.
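The contrast between fixed and learned weights for the different text types can be illustrated with a minimal sketch. It is not the piecewise penalty weighted regression model of the paper, whose details the abstract does not give; it swaps in plain ridge regression as a stand-in. The field names in `TEXT_TYPES`, the relative-frequency feature, and the functions `type_features`, `fit_type_weights`, and `annotate` are all hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical names for the types of associated text (not from the paper).
TEXT_TYPES = ("file_name", "alt", "surrounding")


def type_features(word, texts):
    """Relative frequency of `word` in each type of associated text."""
    feats = []
    for t in TEXT_TYPES:
        tokens = texts.get(t, "").lower().split()
        feats.append(Counter(tokens)[word] / len(tokens) if tokens else 0.0)
    return feats


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_type_weights(rows, labels, penalty=0.1):
    """Ridge regression: minimise ||X w - y||^2 + penalty * ||w||^2.

    Each row holds the per-type features of one (image, word) pair;
    the label is 1.0 if the word is a correct annotation, else 0.0.
    This is an assumed stand-in, not the model proposed in the paper.
    """
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (penalty if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(d)]
    return solve(xtx, xty)


def annotate(texts, weights, top_k=3):
    """Rank candidate words by the weighted sum of their per-type features."""
    vocab = {w for t in TEXT_TYPES for w in texts.get(t, "").lower().split()}
    scored = {w: sum(a * f for a, f in zip(weights, type_features(w, texts)))
              for w in vocab}
    return sorted(scored, key=lambda w: (-scored[w], w))[:top_k]
```

In this sketch the weight of each text type is estimated from labelled (image, word) pairs rather than set by hand, so a type that rarely carries correct annotations in the training data ends up with a small weight.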


Image annotation, adaptive model, image content analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Hongtao Xu (1)
  • Xiangdong Zhou (1, 2)
  • Lan Lin (3)
  • Mei Wang (2)
  • Tat-Seng Chua (2)
  1. School of Computer Science, Fudan University, Shanghai, China
  2. National University of Singapore, Singapore
  3. Tongji University, Shanghai, China
