Multi-feature Subspace Learning via Sparse Correlation Fusion and Embedding

Zhang, Hong; Zhang, Yanpeng

doi:10.1007/978-3-319-03731-8_55


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8294)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

Subspace learning is one of the most established and important techniques in multimedia analysis. Much research has focused on introducing machine learning and statistical methods into multimedia subspace learning for semantic understanding and denoising, with notable success in applications such as content-based retrieval, data clustering, and face recognition. However, most of this work is based on multimedia data of a single modality. With the rapid development of multimedia and information technology, data of different modalities now often coexist, and each modality can complement the others to some extent. Because different multimedia data are usually represented by heterogeneous low-level features, and because of the well-known semantic gap, learning multimedia semantics through multi-feature subspace learning across modalities is both interesting and challenging. In this paper, we analyze the sparse canonical correlation between feature matrices of different multimedia data and construct an isomorphic sparse multi-feature subspace. We further propose a subspace optimization strategy with correlation fusion, which explores both geometry-based content correlation and graph-based semantic correlation. Our algorithm has been applied to content-based multimodal retrieval and data classification. Comprehensive experiments demonstrate that our method outperforms several existing algorithms.
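
To make the idea of sparse canonical correlation between two feature matrices concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given on this page. It assumes a simplified rank-1 scheme in the spirit of penalized matrix decomposition: alternate between the two weight vectors, soft-threshold each update, and renormalize. The function names, the relative-threshold parameters `frac_u` and `frac_v`, and the toy data are all hypothetical.

```python
import math

def soft_threshold(vec, lam):
    # Shrink every entry toward zero by lam; entries smaller than lam become 0.
    return [math.copysign(max(abs(a) - lam, 0.0), a) for a in vec]

def l2_normalize(vec):
    norm = math.sqrt(sum(a * a for a in vec))
    return [a / norm for a in vec] if norm > 0 else list(vec)

def center_columns(M):
    n = len(M)
    means = [sum(row[j] for row in M) / n for j in range(len(M[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in M]

def cross_product(X, Y):
    # K = X^T Y (p x q), the unnormalized cross-covariance of centered data.
    p, q = len(X[0]), len(Y[0])
    return [[sum(X[i][a] * Y[i][b] for i in range(len(X))) for b in range(q)]
            for a in range(p)]

def sparse_cca_rank1(X, Y, frac_u=0.3, frac_v=0.3, iters=50):
    """Return sparse weight vectors (u, v) for one pair of canonical directions.

    Assumption: the threshold is a fixed fraction of the largest absolute
    entry, a simplification of choosing it to satisfy an L1 constraint.
    """
    K = cross_product(center_columns(X), center_columns(Y))
    p, q = len(K), len(K[0])
    u = [0.0] * p
    v = l2_normalize([1.0] * q)
    for _ in range(iters):
        Kv = [sum(K[a][b] * v[b] for b in range(q)) for a in range(p)]
        u = l2_normalize(soft_threshold(Kv, frac_u * max(abs(a) for a in Kv)))
        Ktu = [sum(K[a][b] * u[a] for a in range(p)) for b in range(q)]
        v = l2_normalize(soft_threshold(Ktu, frac_v * max(abs(a) for a in Ktu)))
    return u, v

# Toy data (hypothetical): only the first feature of X and the first
# feature of Y share the latent signal z; the other columns are patterns
# that co-vary with it weakly or not at all.
z = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
alt = [1, -1, 1, -1, 1, -1, 1, -1]
blk = [1, 1, -1, -1, 1, 1, -1, -1]
mix = [1, -1, -1, 1, 1, -1, -1, 1]
X = [[z[i], alt[i], blk[i]] for i in range(8)]
Y = [[2 * z[i], mix[i]] for i in range(8)]

u, v = sparse_cca_rank1(X, Y)
print("u =", u)
print("v =", v)
```

On this toy data the thresholding step is expected to zero out the weights of the weakly related features, which is the sense in which the learned projection is sparse. Projecting each modality onto its weight vector then places both in a shared one-dimensional space where they can be compared.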

Author information

Authors and Affiliations

College of Computer Science & Technology, Wuhan University of Science & Technology, China
Hong Zhang & Yanpeng Zhang
Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, China
Hong Zhang
State Key Laboratory of Software Engineering, Wuhan University, 430072, China
Hong Zhang

Editor information

Editors and Affiliations

EURECOM, Multimedia Department, Sophia Antipolis, France
Benoit Huet
Department of Computer Science, City University of Hong Kong, Tat Chee Ave, Kowloon, Hong Kong
Chong-Wah Ngo
Nanjing University of Science and Technology, 210093, Nanjing, China
Jinhui Tang
Department of Computer Science and Technology, Nanjing University, Xianlin Avenue No. 163, 210023, Nanjing, China
Zhi-Hua Zhou
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Alexander G. Hauptmann
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117583, Singapore, Singapore
Shuicheng Yan

About this paper

Cite this paper

Zhang, H., Zhang, Y. (2013). Multi-feature Subspace Learning via Sparse Correlation Fusion and Embedding. In: Huet, B., Ngo, CW., Tang, J., Zhou, ZH., Hauptmann, A.G., Yan, S. (eds) Advances in Multimedia Information Processing – PCM 2013. PCM 2013. Lecture Notes in Computer Science, vol 8294. Springer, Cham. https://doi.org/10.1007/978-3-319-03731-8_55

DOI: https://doi.org/10.1007/978-3-319-03731-8_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03730-1
Online ISBN: 978-3-319-03731-8
eBook Packages: Computer Science, Computer Science (R0)
