Journal of Signal Processing Systems, Volume 58, Issue 3, pp 387–406

Knowledge Based Image Annotation Refinement

  • Yohan Jin
  • Latifur Khan
  • B. Prabhakaran


Images on the Web and on personal computers have become pervasive in everyday life. Many Automatic Image Annotation (AIA) algorithms have been proposed to retrieve these images effectively. However, AIA still suffers from low accuracy because it cannot overcome the semantic gap between low-level features (e.g., ‘color’, ‘texture’ and ‘shape’) and high-level semantic meanings (e.g., ‘sky’, ‘beach’). As a result, AIA techniques annotate images with many noisy keywords. In this paper, we propose a novel approach that augments the classical model with a generic knowledge base, WordNet, and uses it to prune irrelevant keywords. To identify irrelevant keywords, we investigate various semantic similarity measures between keywords and fuse the outcomes of all these measures into a final decision using Dempster-Shafer evidence combination. Furthermore, we reformulate the removal of erroneous keywords from image annotations as a graph-partitioning problem, namely the weighted MAX-CUT problem. Web images may have a very large number of candidate keywords, so we need a deterministic polynomial-time algorithm for the MAX-CUT problem. We show that finding the optimal solution for removing noisy keywords in the graph is an NP-Complete problem and propose a new methodology for Knowledge Based Image Annotation Refinement (KBIAR) using a deterministic polynomial-time algorithm, namely a randomized approximation graph algorithm. Finally, we demonstrate the superiority of this algorithm over traditional ones, including the most recent work, on a benchmark dataset.
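The evidence-fusion step mentioned in the abstract can be illustrated with Dempster's rule of combination. The sketch below is not the authors' implementation: it assumes a two-element frame of discernment (a keyword is either "relevant" or "irrelevant"), and the mass values, the names `m_a`, `m_b` and the function `combine` are hypothetical, chosen only to show how the outputs of two similarity measures might be fused.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset (a subset of the frame of
    discernment) to the mass assigned to it; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal sets contribute to the conflict mass K.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the evidence cannot be combined")
    # Normalise by 1 - K so that the fused masses sum to 1 again.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


REL = frozenset({"relevant"})
IRR = frozenset({"irrelevant"})
BOTH = REL | IRR  # mass on the whole frame expresses uncertainty

# Hypothetical masses derived from two similarity measures for one keyword.
m_a = {REL: 0.6, IRR: 0.1, BOTH: 0.3}
m_b = {REL: 0.5, IRR: 0.2, BOTH: 0.3}

fused = combine(m_a, m_b)
keep_keyword = fused[REL] > fused[IRR]
```

In this toy setting a keyword would be kept when the fused mass on "relevant" exceeds the fused mass on "irrelevant"; the decision rule actually used in the paper may differ.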


  • Image annotation
  • Image annotation refinement
  • WordNet
  • Semantic similarity
  • Max-cut algorithm


Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.



Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. Data Mining Team, MySpace (Fox Interactive Media), Beverly Hills, USA
  2. Department of Computer Science, University of Texas at Dallas, Richardson, USA
