Multimedia Tools and Applications

Volume 76, Issue 6, pp 8831–8857

Social tag relevance learning via ranking-oriented neighbor voting


Abstract

High-quality tags play a critical role in applications involving online multimedia search, such as social image annotation, sharing, and browsing. However, user-generated tags in the real world are often too imprecise and incomplete to describe image content, which severely degrades the performance of current search systems. To improve the descriptive power of social tags, a fundamental issue is tag relevance learning, which concerns how to effectively interpret the relevance of a tag with respect to the content of an image. In this paper, we investigate the problem from the new perspective of learning to rank, and develop a novel approach in which tag relevance learning directly optimizes the ranking performance of tag-based image search. Specifically, a supervision step is introduced into the neighbor voting scheme, in which tag relevance is estimated by accumulating votes from visual neighbors. By explicitly modeling neighbor weights and tag correlations, the approach avoids the risk of making heuristic assumptions. Moreover, our approach does not suffer from the scalability problem, since it learns a single generic model that can be applied to all tags. Extensive experiments on two benchmark datasets, in comparison with state-of-the-art methods, demonstrate the promise of our approach.

Keywords

Tag-based image search; Tag relevance learning; Neighbor voting; Learning to rank
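
The neighbor voting scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it only shows the basic idea of accumulating (optionally weighted) votes from visual neighbors. The function name `neighbor_votes`, the toy data, the Euclidean distance, and the prior-subtraction step are all assumptions made for illustration. The `weights` argument marks where learned neighbor weights could plug in; how the paper learns them is not shown here.

```python
import numpy as np

def neighbor_votes(features, tags, query_idx, k=3, weights=None):
    """Score every tag for one image by accumulating votes from its k visual neighbors.

    features: (n_images, d) array of visual features (hypothetical input).
    tags:     (n_images, n_tags) binary matrix, 1 if a user assigned the tag.
    weights:  optional length-k vector of per-rank neighbor weights; uniform if None.
    """
    # Euclidean distance from the query image to every image (assumed metric).
    dists = np.linalg.norm(features - features[query_idx], axis=1)
    dists[query_idx] = np.inf  # an image does not vote for itself
    neighbors = np.argsort(dists)[:k]
    if weights is None:
        weights = np.ones(k)
    # Weighted number of neighbors carrying each tag.
    votes = weights @ tags[neighbors]
    # Assumption of this sketch: subtract the votes expected from overall tag
    # frequency, so that very common tags are not favored automatically.
    prior = tags.mean(axis=0)
    return votes - weights.sum() * prior

# Toy data: two visual clusters; column 0 and column 1 are two hypothetical tags.
features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
tags = np.array([[1, 1], [1, 0], [1, 0], [0, 1], [0, 1]])

scores = neighbor_votes(features, tags, query_idx=0, k=2)
weighted = neighbor_votes(features, tags, query_idx=0, k=2,
                          weights=np.array([2.0, 1.0]))
```

For image 0, both nearest neighbors carry the first tag and neither carries the second, so the first tag receives the higher score. Non-uniform weights change the magnitude of the scores but, on this toy data, not their order.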


Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
  2. School of Information Systems, Singapore Management University, Singapore, Singapore
  3. School of Computer Science and Technology, Shandong University, Jinan, China
