Abstract
Extracting good representations from images is essential for many computer vision tasks. Progress in deep learning has shown the importance of learning hierarchical features, but it is also important to learn features through multiple paths. This paper presents Multipath Convolutional-Recursive Neural Networks (M-CRNNs), a novel scheme that learns image features along multiple paths using models that combine convolutional and recursive neural networks (CNNs and RNNs). The CNNs learn low-level features, and the RNNs, which take the outputs of the CNNs as their inputs, learn efficient high-level features. The final feature vector of an image is the combination of the features from all the paths. The results show that the features learned by M-CRNNs form a highly discriminative image representation that increases precision in object recognition.
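The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the random placeholder weights, the filter sizes, the pooling sizes, and the choice to make paths differ only by filter size are all assumptions made for illustration, and the final combination is assumed to be a plain concatenation.

```python
import numpy as np


def conv_layer(image, filters, pool):
    """Filter the image (valid cross-correlation), rectify, then average-pool."""
    k, f, _ = filters.shape
    # All f x f patches of the image: shape (H - f + 1, W - f + 1, f, f).
    patches = np.lib.stride_tricks.sliding_window_view(image, (f, f))
    # Absolute response of every filter at every location: shape (k, H', W').
    resp = np.abs(np.einsum("hwij,kij->khw", patches, filters))
    h, w = resp.shape[1] // pool, resp.shape[2] // pool
    resp = resp[:, :h * pool, :w * pool]
    # Non-overlapping pool x pool average pooling: shape (k, h, w).
    return resp.reshape(k, h, pool, w, pool).mean(axis=(2, 4))


def recursive_layer(fmap, weights, block=2):
    """Merge block x block neighbourhoods with shared weights until one vector is left."""
    k, size, size2 = fmap.shape
    assert size == size2, "this sketch assumes a square feature map"
    while fmap.shape[1] > 1:
        assert fmap.shape[1] % block == 0, "map size must be a power of `block`"
        s = fmap.shape[1] // block
        # Gather the children of each parent node: shape (s, s, k * block * block).
        children = (fmap.reshape(k, s, block, s, block)
                        .transpose(1, 3, 0, 2, 4)
                        .reshape(s, s, -1))
        # The same weight matrix is applied at every node and every level.
        parents = np.tanh(children @ weights.T)   # shape (s, s, k)
        fmap = parents.transpose(2, 0, 1)         # back to (k, s, s)
    return fmap.reshape(k)


def multipath_features(image, paths):
    """Run every CNN -> RNN path and concatenate the resulting vectors."""
    feats = [recursive_layer(conv_layer(image, p["filters"], p["pool"]), p["rnn"])
             for p in paths]
    return np.concatenate(feats)


def make_path(rng, k, f, pool, block=2):
    """Hypothetical helper: random placeholder weights for one path."""
    return {
        "filters": rng.standard_normal((k, f, f)),
        "pool": pool,
        "rnn": 0.1 * rng.standard_normal((k, k * block * block)),
    }


rng = np.random.default_rng(0)
image = rng.random((20, 20))           # stand-in for a grayscale image
paths = [
    make_path(rng, k=8, f=5, pool=4),  # 16 x 16 responses -> 4 x 4 map
    make_path(rng, k=8, f=9, pool=3),  # 12 x 12 responses -> 4 x 4 map
]
features = multipath_features(image, paths)
print(features.shape)  # (16,)
```

Each path contributes one 8-dimensional vector, so the two-path example yields a 16-dimensional image descriptor that could then be passed to a classifier. In a real system the filters and RNN weights would be learned or selected rather than drawn at random as they are here.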
Copyright information
© 2014 IFIP International Federation for Information Processing
About this paper
Cite this paper
Li, X., Jiang, S., Song, X., Herranz, L., Shi, Z. (2014). Multipath Convolutional-Recursive Neural Networks for Object Recognition. In: Shi, Z., Wu, Z., Leake, D., Sattler, U. (eds) Intelligent Information Processing VII. IIP 2014. IFIP Advances in Information and Communication Technology, vol 432. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44980-6_30
DOI: https://doi.org/10.1007/978-3-662-44980-6_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44979-0
Online ISBN: 978-3-662-44980-6
eBook Packages: Computer Science, Computer Science (R0)