Automatic image annotation using visual content and folksonomies

Multimedia Tools and Applications

Abstract

Automatic image annotation is an important and challenging task that becomes increasingly necessary when managing large image collections. This paper describes techniques for automatic image annotation that take advantage of collaboratively annotated image databases, so-called visual folksonomies. Our approach applies two techniques based on image analysis: first, classification, which annotates images with a controlled vocabulary, and second, tag propagation along visually similar images. The latter propagates user-generated, folksonomic annotations and is therefore capable of dealing with an unlimited vocabulary. Experiments with a pool of Flickr images demonstrate the high accuracy and efficiency of the proposed methods in the task of automatic image annotation. Both techniques were applied in the prototypical tag recommender “tagr”.
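The idea of tag propagation along visually similar images can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the feature vectors, the cosine similarity measure, the similarity-weighted voting, and the names `cosine_similarity`, `propagate_tags`, `k`, and `max_tags` are all assumptions chosen for illustration, since the abstract does not specify the visual features or the ranking scheme.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def propagate_tags(query_features, tagged_images, k=3, max_tags=5):
    """Suggest tags for an untagged image from its k most similar tagged images.

    tagged_images is a list of (feature_vector, set_of_tags) pairs, standing in
    for a folksonomy such as a pool of publicly tagged photos (hypothetical data
    layout). Each neighbour votes for its tags with a weight equal to its
    similarity to the query image.
    """
    scored = [
        (cosine_similarity(query_features, features), tags)
        for features, tags in tagged_images
    ]
    # Sort by similarity only, so ties never try to compare tag sets.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    votes = defaultdict(float)
    for similarity, tags in scored[:k]:
        for tag in tags:
            votes[tag] += similarity

    # Highest vote first; alphabetical order breaks ties deterministically.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:max_tags]]
```

Because the candidate tags come directly from the neighbours' user-generated annotations, the vocabulary is not fixed in advance, which is the property the abstract contrasts with classification over a controlled vocabulary.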


Notes

  1. http://www.flickr.com

  2. http://www.techcrunch.com/2007/11/13/2-billion-photos-on-flickr

  3. Flickr gives API-level access to the photos and tags, as long as the pictures are set as public by their owners.

  4. http://www.imgseek.net

  5. http://www.beholdsearch.com

  6. http://labs.systemone.at/retrievr

  7. http://www.alipr.com

  8. http://flickr.com/groups/fruitandveg

  9. http://www.mit.edu/~markaf/projects/wordnet/

  10. http://wordnet.princeton.edu/

  11. http://www.flickr.com


Acknowledgements

The authors would like to thank their colleagues Marcus Thaler and Werner Haas for their support and feedback. The Know-Center is funded within the Austrian COMET Program—Competence Centers for Excellent Technologies—under the auspices of the Austrian Ministry of Transport, Innovation and Technology, the Austrian Ministry of Economics and Labor and by the State of Styria. COMET is managed by the Austrian Research Promotion Agency FFG.

Author information

Corresponding author

Correspondence to Roland Mörzinger.

About this article

Cite this article

Lindstaedt, S., Mörzinger, R., Sorschag, R. et al. Automatic image annotation using visual content and folksonomies. Multimed Tools Appl 42, 97–113 (2009). https://doi.org/10.1007/s11042-008-0247-7

  • DOI: https://doi.org/10.1007/s11042-008-0247-7
