Measuring Multi-modality Similarities Via Subspace Learning for Cross-Media Retrieval

Zhang, Hong; Weng, Jianguang

doi:10.1007/11922162_111

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4261)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

Cross-media retrieval seeks to break through the limitation of modality so that users can query multimedia objects with examples of a different modality. To make this possible, one must first solve the problem of measuring similarity between media objects that have heterogeneous low-level features. This paper proposes an approach that learns both intra- and inter-media correlations among multi-modality feature spaces and constructs an MLE semantic subspace containing multimedia objects of different modalities. In addition, relevance feedback strategies are developed to enhance the efficiency of cross-media retrieval from both short- and long-term perspectives. Experiments show encouraging results and effective retrieval performance.
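
The idea of comparing objects of different modalities inside a shared subspace can be illustrated with a small sketch. This is not the method of the paper: it swaps in regularized canonical correlation analysis (CCA) as a stand-in for the subspace learning step, uses synthetic "image" and "audio" features generated from a hypothetical shared latent variable, and ranks by Euclidean distance in the learned subspace. The names `fit_cca`, `rank_across_media`, and all dimensions and noise levels are assumptions made for illustration.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T

def fit_cca(X, Y, k, reg=1e-3):
    """Fit regularized CCA on paired rows of X and Y (illustrative stand-in)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Within-modality covariances (regularized) and cross-modality covariance.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten each modality, then take the SVD of the whitened cross-covariance.
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    # Projection matrices into the shared k-dimensional subspace.
    return mx, my, Kx @ U[:, :k], Ky @ Vt[:k].T, s[:k]

def rank_across_media(query, mean_q, W_q, gallery, mean_g, W_g):
    """Rank gallery objects of one modality for a query of another modality."""
    q = (query - mean_q) @ W_q
    G = (gallery - mean_g) @ W_g
    dist = np.linalg.norm(G - q, axis=1)  # distance measured in the shared subspace
    return np.argsort(dist)

# Synthetic paired data: both modalities depend on the same hidden semantics Z.
rng = np.random.default_rng(0)
n, k = 200, 3
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, 10)) + 0.01 * rng.normal(size=(n, 10))  # "image" features
Y = Z @ rng.normal(size=(k, 8)) + 0.01 * rng.normal(size=(n, 8))    # "audio" features

mx, my, Wx, Wy, corr = fit_cca(X, Y, k)
ranking = rank_across_media(X[0], mx, Wx, Y, my, Wy)
print("canonical correlations:", np.round(corr, 3))
print("top-5 audio matches for image 0:", ranking[:5])
```

In this toy setup the raw feature spaces have different dimensionality and cannot be compared directly, whereas the projected vectors can. A real system would fit the projections on training pairs and evaluate on held-out objects; the sketch reuses the same data for brevity.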

Author information

Authors and Affiliations

The Institute of Artificial Intelligence, Zhejiang University, HangZhou, 310027, P.R. China
Hong Zhang & Jianguang Weng

Editor information

Editors and Affiliations

College of Computer Science, Zhejiang University, China
Yueting Zhuang
Department of Computer Science and Technology, Tsinghua University, P.R. China
Shi-Qiang Yang
Microsoft Corporation, Microsoft China R&D Group, 49 Zhichun Road, 100080, Beijing, China
Yong Rui
College of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, Zhejiang Province, China
Qinming He

About this paper

Cite this paper

Zhang, H., Weng, J. (2006). Measuring Multi-modality Similarities Via Subspace Learning for Cross-Media Retrieval. In: Zhuang, Y., Yang, SQ., Rui, Y., He, Q. (eds) Advances in Multimedia Information Processing - PCM 2006. PCM 2006. Lecture Notes in Computer Science, vol 4261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11922162_111

DOI: https://doi.org/10.1007/11922162_111
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48766-1
Online ISBN: 978-3-540-48769-2
eBook Packages: Computer Science, Computer Science (R0)
