Manifold Learning Based Cross-media Retrieval: A Solution to Media Object Complementary Nature

Zhuang, Yueting; Yang, Yi; Wu, Fei; Pan, Yunhe

doi:10.1007/s11265-006-0020-y


Abstract

Media objects of different modalities always occur together and are naturally complementary to each other, both in semantics and in modality. In this paper, we propose a manifold-learning-based cross-media retrieval approach that addresses two basic but crucial questions in understanding media object semantics and in cross-media retrieval. First, considering semantic complementarity, how can we represent co-occurring media objects and fuse the complementary information they carry so that the integrated semantics are understood precisely? Second, considering modality complementarity, how can we bridge modalities to establish a cross-index and facilitate cross-media retrieval? To solve these two problems, we first construct a Multimedia Document (MMD) Semi-Semantic Graph (MMDSSG) and then adopt Multidimensional Scaling to create an MMD Semantic Space (MMDSS). Both long-term and short-term feedback are proposed to boost system performance: long-term feedback refines the MMDSSG, and short-term feedback introduces new items that are not in the training set into the MMDSS. Because all MMDs and their component media objects lie in the MMDSS and are indexed uniformly by their coordinates there, regardless of modality, this semantic space acts as a bridge between media objects of different modalities, and cross-media retrieval can be achieved easily. Experimental results are encouraging and indicate that the proposed approach is effective.
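The graph-then-embedding pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the MMDSSG edge weights are computed or how graph distances are derived, so the sketch assumes a small hand-made weighted graph, uses shortest-path distances (an Isomap-style choice made here for illustration), and applies classical Multidimensional Scaling. The function names `shortest_path_distances`, `classical_mds`, and `nearest`, and the toy graph itself, are hypothetical.

```python
import numpy as np


def shortest_path_distances(weights):
    """All-pairs shortest paths (Floyd-Warshall) on a dense weight matrix.

    `weights[i, j]` is the edge weight between nodes i and j, or np.inf
    when there is no edge. Assumption: the graph is connected.
    """
    dist = np.asarray(weights, dtype=float).copy()
    np.fill_diagonal(dist, 0.0)
    for k in range(dist.shape[0]):
        # Relax every pair (i, j) through intermediate node k.
        dist = np.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return dist


def classical_mds(dist, dim):
    """Classical MDS: embed points so Euclidean distances approximate `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues caused by non-Euclidean distances.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def nearest(coords, query_index, k):
    """Indices of the k items closest to the query in the embedded space."""
    d = np.linalg.norm(coords - coords[query_index], axis=1)
    d[query_index] = np.inf  # exclude the query itself
    return np.argsort(d)[:k]


# Hypothetical toy graph: four documents connected in a chain 0-1-2-3.
INF = np.inf
toy_weights = np.array([
    [0.0, 1.0, INF, INF],
    [1.0, 0.0, 1.0, INF],
    [INF, 1.0, 0.0, 1.0],
    [INF, INF, 1.0, 0.0],
])

graph_dist = shortest_path_distances(toy_weights)
coords = classical_mds(graph_dist, dim=1)
neighbours = nearest(coords, query_index=0, k=1)
```

In this sketch every item is indexed only by its coordinates in the embedded space, so a nearest-neighbour query does not need to know which modality an item came from. That mirrors the uniform indexing the abstract describes, under the assumptions stated above.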

Author information

Authors and Affiliations

College of Computer Science and Engineering, Zhejiang University, Hangzhou, People’s Republic of China
Yueting Zhuang, Yi Yang, Fei Wu & Yunhe Pan

Corresponding author

Correspondence to Yi Yang.

About this article

Cite this article

Zhuang, Y., Yang, Y., Wu, F. et al. Manifold Learning Based Cross-media Retrieval: A Solution to Media Object Complementary Nature. J VLSI Sign Process Syst Sign Image Video Technol 46, 153–164 (2007). https://doi.org/10.1007/s11265-006-0020-y

Received: 30 August 2006
Accepted: 24 November 2006
Published: 03 February 2007
Issue Date: March 2007
DOI: https://doi.org/10.1007/s11265-006-0020-y
