Abstract
We examine whether a traditional automated annotation system can be improved by using background knowledge. By traditional we mean any machine learning approach used together with image analysis techniques. As a baseline for our experiments we use the work of Yavlinsky et al. [1], who deployed non-parametric density estimation. We observe that probabilistic image analysis by itself is not enough to describe the rich semantics of an image. Our hypothesis is that more accurate annotations can be produced by introducing additional knowledge in the form of statistical co-occurrence of terms. This knowledge is provided by the context of the image, which otherwise independent keyword generation would miss. We test our algorithm on two datasets: Corel 5k and ImageCLEF 2008. On Corel 5k we obtain significantly better results than the baseline, while on ImageCLEF 2008 our algorithm ranks in the top quartile of all submitted methods.
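The abstract describes refining independently generated keyword probabilities with second-order (pairwise co-occurrence) statistics gathered from training annotations. The sketch below illustrates that general idea only; the function names, the conditional-probability estimate, and the linear blending rule are our own illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch: blend each keyword's independent score with the
# average co-occurrence support it receives from the other candidates.
from collections import Counter
from itertools import combinations

def cooccurrence_counts(captions):
    """Count keyword occurrences and unordered keyword pairs in captions."""
    pair_counts = Counter()
    word_counts = Counter()
    for caption in captions:
        words = set(caption)
        word_counts.update(words)
        for a, b in combinations(sorted(words), 2):
            pair_counts[(a, b)] += 1
    return pair_counts, word_counts

def rerank(scores, pair_counts, word_counts, alpha=0.5):
    """Refine independent keyword scores with co-occurrence evidence.

    For each candidate keyword w, estimate P(w | v) for every other
    candidate v from the co-occurrence counts, average those estimates,
    and blend the result with w's original score.
    """
    refined = {}
    for w, p in scores.items():
        support = 0.0
        others = [v for v in scores if v != w]
        for v in others:
            key = tuple(sorted((w, v)))
            joint = pair_counts.get(key, 0)
            denom = word_counts.get(v, 0)
            if denom:
                support += joint / denom  # rough P(w | v) estimate
        if others:
            support /= len(others)
        refined[w] = alpha * p + (1 - alpha) * support
    return refined

# Toy training captions and candidate scores (made up for illustration).
captions = [["sky", "sea", "beach"], ["sky", "clouds"], ["sea", "beach"]]
pairs, words = cooccurrence_counts(captions)
scores = {"sky": 0.6, "sea": 0.5, "tiger": 0.4}
print(rerank(scores, pairs, words))
```

Here "tiger" never co-occurs with the other candidates, so its score is pulled down relative to the mutually supporting "sky" and "sea", which is the kind of contextual correction the paper aims for.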
References
Yavlinsky, A., Schofield, E., Rüger, S.: Automated image annotation using global features and robust nonparametric density estimation. In: Leow, W.-K., Lew, M., Chua, T.-S., Ma, W.-Y., Chaisorn, L., Bakker, E.M. (eds.) CIVR 2005. LNCS, vol. 3568, pp. 507–517. Springer, Heidelberg (2005)
Duygulu, P., Barnard, K., de Freitas, J.F.G., Forsyth, D.A.: Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002. LNCS, vol. 2353, pp. 97–112. Springer, Heidelberg (2002)
Melamed, I.D.: Empirical methods for exploiting parallel texts. PhD thesis, University of Pennsylvania (1998)
Jin, R., Chai, J.Y., Si, L.: Effective automatic image annotation via a coherent language model and active learning. In: Proceedings of the 12th International ACM Conference on Multimedia, pp. 892–899 (2004)
Mori, Y., Takahashi, H., Oka, R.: Image-to-word transformation based on dividing and vector quantizing images with words. In: International Workshop on Multimedia Intelligent Storage and Retrieval Management (1999)
Barnard, K., Duygulu, P., Forsyth, D., de Freitas, N., Blei, D., Jordan, M.: Matching words and pictures. Journal of Machine Learning Research 3, 1107–1135 (2003)
Jin, Y., Khan, L., Wang, L., Awad, M.: Image annotations by combining multiple evidence & WordNet. In: Proceedings of the 13th International ACM Conference on Multimedia, pp. 706–715 (2005)
Liu, J., Li, M., Ma, W.Y., Liu, Q., Lu, H.: An adaptive graph model for automatic image annotation. In: Proceedings of the 8th ACM International Workshop on Multimedia Information Retrieval, pp. 61–70 (2006)
Zhou, X., Wang, M., Zhang, Q., Zhang, J., Shi, B.: Automatic image annotation by an iterative approach: incorporating keyword correlations and region matching. In: Proceedings of the International ACM Conference on Image and Video Retrieval, pp. 25–32 (2007)
Jeon, J., Lavrenko, V., Manmatha, R.: Automatic image annotation and retrieval using cross-media relevance models. In: Proceedings of International ACM Conference on Research and Development in Information Retrieval, pp. 119–126 (2003)
Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley, Reading (1999)
Escalante, H.J., Montes, M., Sucar, L.E.: Word co-occurrence and Markov Random Fields for improving automatic image annotation. In: Proceedings of the 18th British Machine Vision Conference (2007)
Tollari, S., Detyniecki, M., Fakeri-Tabrizi, A., Amini, M.R., Gallinari, P.: UPMC/LIP6 at ImageCLEFphoto 2008: On the exploitation of visual concepts (VCDT). In: Evaluating Systems for Multilingual and Multimodal Information Access – 9th Workshop of the Cross-Language Evaluation Forum (2008)
Deselaers, T., Hanbury, A.: The visual concept detection task in ImageCLEF 2008. In: Evaluating Systems for Multilingual and Multimodal Information Access – 9th Workshop of the Cross-Language Evaluation Forum (2008)
Miller, G.A., Charles, W.G.: Contextual correlates of semantic similarity. Language and Cognitive Processes 6(1), 1–28 (1991)
Manning, C.D., Schütze, H.: Foundations of statistical natural language processing. MIT Press, Cambridge (1999)
Makadia, A., Pavlovic, V., Kumar, S.: A new baseline for image annotation. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302. Springer, Heidelberg (2008)
Fawcett, T.: An introduction to ROC analysis. Pattern Recognition Letters 27(8), 861–874 (2006)
Hauptmann, A., Yan, R., Lin, W.H.: How many high-level concepts will fill the semantic gap in news video retrieval? In: Proceedings of the 6th ACM International Conference on Image and Video Retrieval, pp. 627–634 (2007)
© 2009 Springer-Verlag Berlin Heidelberg
Llorente, A., Rüger, S. (2009). Using Second Order Statistics to Enhance Automated Image Annotation. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00957-0
Online ISBN: 978-3-642-00958-7