Tag relevance fusion for social image retrieval

  • Special Issue Paper
  • Multimedia Systems

Abstract

Due to the subjective nature of social tagging, measuring the relevance of social tags with respect to the visual content is crucial for retrieving the increasing amounts of social-networked images. Observing the limits of any single measurement of tag relevance, we introduce in this paper tag relevance fusion as an extension to methods for tag relevance estimation. We present a systematic study, covering tag relevance fusion in early and late stages, and in supervised and unsupervised settings. Experiments on a large present-day benchmark set show that tag relevance fusion leads to better image retrieval. Moreover, unsupervised tag relevance fusion is found to be practically as effective as supervised tag relevance fusion, but without the need for any training effort. This finding suggests the potential of tag relevance fusion for real-world deployment.
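To make the idea of unsupervised late fusion concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that several tag relevance estimators have each scored the same images for one tag, rescales each estimator's scores to [0, 1], and averages them with uniform weights, so no training is involved. The function names, the min-max normalization, and the example scores are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one estimator's scores to [0, 1] so estimators become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so this estimator carries no ranking information.
        return {img: 0.0 for img in scores}
    return {img: (s - lo) / (hi - lo) for img, s in scores.items()}


def late_fusion_average(score_lists: List[Dict[str, float]]) -> Dict[str, float]:
    """Unsupervised late fusion (assumed form): uniform average of normalized scores."""
    normalized = [min_max_normalize(s) for s in score_lists]
    images = normalized[0].keys()
    return {
        img: sum(n[img] for n in normalized) / len(normalized)
        for img in images
    }


# Hypothetical relevance scores of one tag for three images,
# produced by two estimators that use different visual features.
scores_color = {"img_a": 12.0, "img_b": 3.0, "img_c": 7.5}
scores_texture = {"img_a": 0.2, "img_b": 0.9, "img_c": 0.8}

fused = late_fusion_average([scores_color, scores_texture])

# Rank images for the tag by fused relevance, most relevant first.
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

In this toy example each estimator prefers a different image, while the image that both estimators score reasonably well ends up ranked first after fusion. A supervised variant would instead learn the combination weights from labeled examples.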

Notes

  1. http://pan.baidu.com/s/1gdd3dBH.

  2. http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm. As some images are no longer available on Flickr, the dataset used in this paper is slightly smaller than the original release.

Acknowledgments

The author is grateful to Dr. Cees Snoek and Dr. Marcel Worring for their very useful comments on this work. The research was supported by NSFC (No. 61303184), SRFDP (No. 20130004120006), the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China (No. 14XNLQ01), and Shanghai Key Laboratory of Intelligent Information Processing, China (Grant No. IIPL-2014-002).

Author information

Corresponding author

Correspondence to Xirong Li.

About this article

Cite this article

Li, X. Tag relevance fusion for social image retrieval. Multimedia Systems 23, 29–40 (2017). https://doi.org/10.1007/s00530-014-0430-9

  • DOI: https://doi.org/10.1007/s00530-014-0430-9
