Transforming Auto-Encoders

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 6791)

Abstract

The artificial neural networks that are used to recognize shapes typically use one or more layers of learned feature detectors that produce scalar outputs. By contrast, the computer vision community uses complicated, hand-engineered features, like SIFT [6], that produce a whole vector of outputs including an explicit representation of the pose of the feature. We show how neural networks can be used to learn features that output a whole vector of instantiation parameters and we argue that this is a much more promising way of dealing with variations in position, orientation, scale and lighting than the methods currently employed in the neural networks community. It is also more promising than the hand-engineered features currently used in computer vision because it provides an efficient way of adapting the features to the domain.

Keywords

  • Invariance
  • auto-encoder
  • shape representation


References

  1. Berkes, P., Turner, R.E., Sahani, M.: A structured model of video reproduces primary visual cortical organisation. PLoS Computational Biology 5(9), 1–16 (2009)

  2. Freeman, W., Adelson, E.: The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(9), 891–906 (1991)

  3. Hinton, G.E.: Shape representation in parallel systems. In: Proc. 7th International Joint Conference on Artificial Intelligence, vol. 2, pp. 1088–1096 (1981)

  4. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)

  5. Lee, H., Grosse, R., Ranganath, R., Ng, A.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proc. 26th International Conference on Machine Learning (2009)

  6. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proc. International Conference on Computer Vision (1999)

  7. Memisevic, R., Hinton, G.: Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural Computation 22, 1473–1492 (2010)

  8. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proc. 27th International Conference on Machine Learning (2010)

  9. Pelli, D.G., Tillman, K.A.: The uncrowded window of object recognition. Nature Neuroscience 11, 1129–1135 (2008)

  10. Ranzato, M., Huang, F., Boureau, Y., LeCun, Y.: Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: Proc. Computer Vision and Pattern Recognition Conference (CVPR 2007). IEEE Press, Los Alamitos (2007)

  11. Riesenhuber, M., Poggio, T.: Hierarchical models of object recognition in cortex. Nature Neuroscience 2, 1019–1025 (1999)

  12. Zemel, R.S., Mozer, M.C., Hinton, G.E.: Traffic: Recognizing objects using hierarchical reference frame transformations. In: Touretzky, D.S. (ed.) Advances in Neural Information Processing Systems, pp. 266–273. Morgan Kaufmann, San Mateo (1990)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hinton, G.E., Krizhevsky, A., Wang, S.D. (2011). Transforming Auto-Encoders. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2011. ICANN 2011. Lecture Notes in Computer Science, vol 6791. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21735-7_6

  • DOI: https://doi.org/10.1007/978-3-642-21735-7_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21734-0

  • Online ISBN: 978-3-642-21735-7

  • eBook Packages: Computer Science, Computer Science (R0)