Abstract
The availability of depth images opens a new route to the challenging problem of object recognition. However, when labeled data are scarce, a discriminative classifier cannot be learned even with depth information. To address this problem, we extend the LCCRRD method with the kernel trick. First, we construct two RGB classifiers using all labeled RGB images from the source and target domains. By exploiting the relationship between the two domains, the significant samples for both classifiers are boosted and the non-significant ones are inhibited. In this process, the knowledge of the source RGB classifier is transferred effectively to the target RGB classifier. Then, to improve the RGB-D classifier with knowledge from the source domain, its predictions are made consistent with those of the target RGB classifier. Furthermore, all parameters are optimized in a unified objective function. Experiments on four cross-domain dataset pairs show that our approach is effective and promising.
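The abstract does not state the objective function, so the following is only a minimal sketch of the general idea of a kernelized classifier whose predictions are pulled toward those of a second classifier. It is not the LCCRRD formulation. The function names, the RBF kernel, the least-squares loss, and the weights `lam` and `mu` are all assumptions made for illustration. The sketch minimizes ||KA - Y||^2 + lam * tr(A^T K A) + mu * ||KA - P||^2, where K is a kernel matrix over the training samples, Y holds one-hot labels, and P holds the scores of an already trained RGB classifier on the same samples.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    sq_dist = (
        (X ** 2).sum(axis=1)[:, None]
        + (Z ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def fit_consistent_classifier(K, Y, P, lam=1e-2, mu=1.0):
    """Hypothetical kernel least-squares classifier with a consistency term.

    Setting the gradient of the objective to zero gives the closed form
    A = ((1 + mu) K + lam I)^{-1} (Y + mu P).
    With mu = 0 this reduces to ordinary kernel ridge regression.
    """
    n = K.shape[0]
    return np.linalg.solve((1.0 + mu) * K + lam * np.eye(n), Y + mu * P)

def predict(K_new, A):
    """Predicted class index for each row of K_new (new samples vs. training samples)."""
    return (K_new @ A).argmax(axis=1)

# Toy data standing in for RGB-D features (assumption: two well-separated classes).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(-2, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
labels = np.repeat([0, 1], 20)
Y_onehot = np.eye(2)[labels]
# Stand-in for the target RGB classifier's scores; here it simply agrees with the labels.
P_rgb = Y_onehot.copy()

K_train = rbf_kernel(X_train, X_train)
A_coef = fit_consistent_classifier(K_train, Y_onehot, P_rgb)
train_pred = predict(K_train, A_coef)
```

In this sketch a larger `mu` trusts the RGB classifier's scores more, while `lam` controls smoothness in the kernel-induced feature space. In the paper all parameters are optimized jointly in one objective, which this two-stage illustration does not attempt.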
Notes
1. Caltech-256/RGB-D: ball, calculator, box, mug, flashlight, keyboard, light-bulb, mushroom, can, tomato, total 1132/1824 images. Caltech-256/B3DO: bottle, can, cup, keyboard, monitor, mouse, phone, spoon, total 776/1129 images. ImageNet/RGB-D: apple, banana, mug, keyboard, soda-can, water-bottle, plate, calculator, cereal-box, light-bulb, total 968/1823 images. ImageNet/B3DO: bottle, cup, keyboard, monitor, mouse, phone, plate, spoon, total 789/1135 images [8].
References
1. Lowe, D.G.: Object recognition from local scale-invariant features. In: IEEE International Conference on Computer Vision, pp. 1150–1157 (1999)
2. Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24(4), 509–522 (2002)
3. Donahue, J., Jia, Y., Vinyals, O., et al.: DeCAF: a deep convolutional activation feature for generic visual recognition. In: International Conference on Machine Learning, vol. 32, pp. 647–655 (2014)
4. Bo, L., Ren, X., Fox, D.: Unsupervised feature learning for RGB-D based object recognition. In: Desai, J., Dudek, G., Khatib, O., Kumar, V. (eds.) Experimental Robotics. STAR, vol. 88, pp. 387–402. Springer, Heidelberg (2013). doi:10.1007/978-3-319-00065-7_27
5. Lai, K., Bo, L., Ren, X., Fox, D.: RGB-D object recognition: features, algorithms, and a large scale benchmark. In: Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K. (eds.) Consumer Depth Cameras for Computer Vision. ACVPR, pp. 167–192. Springer, London (2013). doi:10.1007/978-1-4471-4640-7_9
6. Eitel, A., Springenberg, J.T., Spinello, L., et al.: Multimodal deep learning for robust RGB-D object recognition. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 681–687 (2015)
7. Okamoto, M., Nakayama, H.: Unsupervised visual domain adaptation using auxiliary information in target domain. In: IEEE International Symposium on Multimedia, pp. 203–206 (2014)
8. Li, X., Fang, M., Zhang, J.J., et al.: Learning coupled classifiers with RGB images for RGB-D object recognition. Pattern Recogn. 61, 433–446 (2017)
9. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)
10. Weiss, K., Khoshgoftaar, T.M., Wang, D.D.: A survey of transfer learning. J. Big Data 3(1), 1–40 (2016)
11. Mika, S., Rätsch, G., Weston, J., et al.: Fisher discriminant analysis with kernels. In: IEEE Signal Processing Society Workshop, pp. 41–48 (1999)
12. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset (2007)
13. Deng, J., Dong, W., Socher, R., et al.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
14. Lai, K., Bo, L., Ren, X., et al.: A large-scale hierarchical multi-view RGB-D object dataset. In: International Conference on Robotics and Automation, pp. 1817–1824 (2011)
15. Janoch, A., et al.: A category-level 3D object dataset: putting the Kinect to work. In: Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K. (eds.) Consumer Depth Cameras for Computer Vision. ACVPR, pp. 141–165. Springer, London (2013). doi:10.1007/978-1-4471-4640-7_8
16. Bo, L., Ren, X., Fox, D.: Multipath sparse coding using hierarchical matching pursuit. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 660–667 (2013)
17. Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge (2004)
Acknowledgments
This research is supported by the Natural Science Foundation of Heilongjiang Province, China (No. F201012) and the National Science Foundation of China (No. 61370162, No. 61672190, No. 61671175).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Gao, D., Wu, R., Liu, J., Huang, Q., Tang, X., Liu, P. (2017). RGB-D Object Recognition Using the Knowledge Transferred from Relevant RGB Images. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_68
DOI: https://doi.org/10.1007/978-3-319-70136-3_68
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70135-6
Online ISBN: 978-3-319-70136-3
eBook Packages: Computer Science, Computer Science (R0)