Text Mining in Multimedia

  • Zheng-Jun Zha
  • Meng Wang
  • Jialie Shen
  • Tat-Seng Chua
Chapter

Abstract

A large amount of multimedia data (e.g., images and videos) is now available on the Web. A multimedia entity does not appear in isolation but is accompanied by various forms of metadata, such as surrounding text, user tags, ratings, and comments. Mining this textual metadata has been found to be effective in facilitating multimedia information processing and management, and a wealth of research has been dedicated to text mining in multimedia. This chapter provides a comprehensive survey of that recent work. Specifically, the survey focuses on four aspects: (a) surrounding text mining; (b) tag mining; (c) joint text and visual content mining; and (d) cross text and visual content mining. Furthermore, open research issues are identified based on the current research efforts.

Keywords

Text Mining; Multimedia; Surrounding Text; Tagging; Social Network

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Zheng-Jun Zha (1)
  • Meng Wang (1)
  • Jialie Shen (2)
  • Tat-Seng Chua (1)
  1. School of Computing, National University of Singapore, Singapore, Singapore
  2. Singapore Management University, Singapore, Singapore
