Abstract
Multi-modal features are widely used to represent objects or events in pattern recognition and visual understanding. How to effectively integrate these heterogeneous features into a unified low-dimensional feature space has become a crucial issue in machine learning. In this work, we propose a novel approach that integrates heterogeneous features via a Semi-supervised Multi-Modal Deep Network (SMMDN). The proposed model first transforms the original data of each modality into high-level, abstract, homogeneous features. These homogeneous features are then integrated into a new feature vector. In this way, our model obtains abstract fused representations with lower dimensionality and stronger discriminative ability. A series of experiments is conducted on two object recognition datasets. The results show that our approach integrates heterogeneous features effectively and achieves better performance than the compared methods.
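The two-stage idea in the abstract (per-modality transformation into homogeneous features, followed by integration into one fused vector) can be illustrated with a minimal sketch. This is not the authors' SMMDN: the layer sizes, the tanh activations, the concatenation-based fusion, and all names (`make_encoder`, `encode`, `fuse`) are assumptions made for illustration, the weights are random and untrained, and the semi-supervised training objective is not shown.

```python
import numpy as np


def make_encoder(rng, in_dim, hidden_dim, out_dim):
    """Random (untrained) weights for a hypothetical two-layer encoder."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def encode(params, x):
    """Map one modality to a feature vector of the shared (homogeneous) size."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])
    return np.tanh(hidden @ params["W2"] + params["b2"])


def fuse(encoders, fusion, modalities):
    """Encode each modality, concatenate the results, and project them
    to a single lower-dimensional fused representation."""
    homogeneous = [encode(p, x) for p, x in zip(encoders, modalities)]
    stacked = np.concatenate(homogeneous, axis=1)
    return np.tanh(stacked @ fusion["W"] + fusion["b"])


rng = np.random.default_rng(0)

# Assumed sizes: three heterogeneous feature types with different dimensions.
n_samples = 5
input_dims = [128, 64, 32]
common_dim = 16   # size of each homogeneous feature vector
fused_dim = 8     # size of the final fused representation

# Synthetic stand-ins for real multi-modal features.
modalities = [rng.normal(size=(n_samples, d)) for d in input_dims]
encoders = [make_encoder(rng, d, 32, common_dim) for d in input_dims]
fusion = {
    "W": rng.normal(0.0, 0.1, size=(common_dim * len(input_dims), fused_dim)),
    "b": np.zeros(fused_dim),
}

fused = fuse(encoders, fusion, modalities)
print(fused.shape)  # (5, 8)
```

In this sketch the three inputs of sizes 128, 64, and 32 are first brought to a common size of 16 each, so that no single modality dominates the concatenated vector purely because of its raw dimensionality; the final projection then yields one 8-dimensional vector per sample.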
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (No. 61222210) and the 973 Program (2013CB329304). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhao, L., Hu, Q., Zhou, Y. (2015). Heterogeneous Features Integration via Semi-supervised Multi-modal Deep Networks. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9492. Springer, Cham. https://doi.org/10.1007/978-3-319-26561-2_2
DOI: https://doi.org/10.1007/978-3-319-26561-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26560-5
Online ISBN: 978-3-319-26561-2
eBook Packages: Computer Science, Computer Science (R0)