Multimedia Tools and Applications, Volume 56, Issue 1, pp 35–62

Leveraging community metadata for multimodal image ranking

  • Fabian Richter
  • Stefan Romberg
  • Eva Hörster
  • Rainer Lienhart


Searching for relevant images given a query term is an important task in today's large-scale community databases. The image ranking approach presented in this work represents an image collection as a graph that is built using a multimodal similarity measure based on visual features and user tags. We perform a random walk on this graph to find the most common images. We further discuss several scalability issues of the proposed approach and show how queries can be answered quickly within this framework. Experimental results validate the effectiveness of the presented algorithm.
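The idea of ranking by a random walk over a multimodal similarity graph can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, so the Jaccard overlap for tags, the linear blend weight `alpha`, the damping factor, and all function names below are assumptions chosen for illustration only.

```python
def jaccard(tags_a, tags_b):
    """Tag similarity as set overlap (an assumed choice, not from the paper)."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combine(visual_sim, tag_sim, alpha=0.5):
    """Blend two n x n similarity matrices with an assumed linear weight."""
    n = len(visual_sim)
    return [[alpha * visual_sim[i][j] + (1 - alpha) * tag_sim[i][j]
             for j in range(n)] for i in range(n)]


def random_walk_rank(sim, damping=0.85, tol=1e-10, max_iter=1000):
    """Stationary distribution of a damped random walk on a similarity graph."""
    n = len(sim)
    # Row-normalise edge weights into transition probabilities, ignoring self-loops.
    trans = []
    for i in range(n):
        row = [sim[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        if total > 0:
            trans.append([w / total for w in row])
        else:
            trans.append([1.0 / n] * n)  # isolated node: jump uniformly
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # PageRank-style update: uniform jump plus weighted inflow from neighbours.
        new = [(1 - damping) / n
               + damping * sum(rank[i] * trans[i][j] for i in range(n))
               for j in range(n)]
        delta = sum(abs(x - y) for x, y in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank


# Toy collection: images 0-2 look alike and share tags, image 3 is unrelated.
visual = [[1.0, 0.8, 0.7, 0.1],
          [0.8, 1.0, 0.9, 0.1],
          [0.7, 0.9, 1.0, 0.2],
          [0.1, 0.1, 0.2, 1.0]]
tags = [["beach", "sea"], ["beach", "sea", "sun"], ["beach", "sun"], ["car"]]
tag_sim = [[jaccard(a, b) for b in tags] for a in tags]

scores = random_walk_rank(combine(visual, tag_sim))
order = sorted(range(len(scores)), key=lambda i: -scores[i])
print(order, [round(s, 3) for s in scores])
```

In this toy collection the unrelated image 3 receives the lowest score and is ranked last, because little probability mass flows to it from the tightly connected cluster. The dense matrix used here would not scale to a large community database; it only shows the ranking principle.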


Image ranking, Image retrieval, PageRank, Graph



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Fabian Richter (1), corresponding author
  • Stefan Romberg (1)
  • Eva Hörster (1)
  • Rainer Lienhart (1)
  1. Multimedia Computing Lab, University of Augsburg, Augsburg, Germany
