The artificial neural networks that are used to recognize shapes typically use one or more layers of learned feature detectors that produce scalar outputs. By contrast, the computer vision community uses complicated, hand-engineered features, like SIFT , that produce a whole vector of outputs including an explicit representation of the pose of the feature. We show how neural networks can be used to learn features that output a whole vector of instantiation parameters and we argue that this is a much more promising way of dealing with variations in position, orientation, scale and lighting than the methods currently employed in the neural networks community. It is also more promising than the hand-engineered features currently used in computer vision because it provides an efficient way of adapting the features to the domain.
KeywordsInvariance auto-encoder shape representation
Unable to display preview. Download preview PDF.
- 3.Hinton, G.E.: Shape representation in parallel systems. In: Proc. 7th International Joint Conference on Artificial Intelligence, vol. 2, pp. 1088–1096 (1981)Google Scholar
- 5.Lee, H., Grosse, R., Ranganath, R., Ng, A.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proc. 26th International Conference on Machine Learning (2009)Google Scholar
- 6.Lowe, D.G.: Object recognition from local scale-invariant features. In: Proc. International Conference on Computer Vision (1999)Google Scholar
- 8.Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: Proc. 27th International Conference on Machine Learning (2010)Google Scholar
- 10.Ranzato, M., Huang, F., Boureau, Y., LeCun, Y.: Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: Proc. Computer Vision and Pattern Recognition Conference (CVPR 2007). IEEE Press, Los Alamitos (2007)Google Scholar
- 12.Zemel, R.S., Mozer, M.C., Hinton, G.E.: Traffic: Recognizing objects using hier-archical reference frame transformations. In: Touretzky, D.S. (ed.) Advances in Neural Information Processing Systems, pp. 266–273. Morgan Kauffman, San Mateo (1990)Google Scholar