Learning Sequence Neighbourhood Metrics

  • Justin Bayer
  • Christian Osendorfer
  • Patrick van der Smagt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7552)


Recurrent neural networks (RNNs), combined with a pooling operator and the neighbourhood components analysis (NCA) objective function, can detect the characterizing dynamics of sequences and embed them into a fixed-length vector space of arbitrary dimensionality. The resulting features are meaningful and can be used for visualization or for nearest neighbour classification in linear time. This kind of metric learning for sequential data enables the use of algorithms tailored towards fixed-length vector spaces such as ℝⁿ.
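The pipeline named in the abstract (RNN, pooling, NCA objective, nearest neighbour lookup) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the plain tanh recurrence, mean pooling over time, the parameter names (`W_in`, `W_rec`, `b`, `W_out`) and the helper names. Training, i.e. maximising the objective by gradient ascent on the RNN parameters, is omitted.

```python
import numpy as np

def embed_sequence(x, W_in, W_rec, b, W_out):
    """Map a sequence x of shape (T, d_in) to a fixed-length vector.

    Assumed architecture: plain tanh RNN followed by mean pooling over time.
    """
    h = np.zeros(W_rec.shape[0])
    outputs = []
    for x_t in x:
        h = np.tanh(W_in @ x_t + W_rec @ h + b)  # recurrent state update
        outputs.append(W_out @ h)                # per-time-step output
    return np.mean(outputs, axis=0)              # pooling removes the dependence on T

def nca_objective(E, labels):
    """NCA objective: expected number of points whose stochastically
    chosen neighbour has the same label (to be maximised)."""
    sq_dists = np.sum((E[:, None, :] - E[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dists, np.inf)           # a point never selects itself
    logits = -sq_dists
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)            # row i: softmax over neighbours j
    same = labels[:, None] == labels[None, :]
    return np.sum(p * same)

def nearest_neighbour_label(query, E, labels):
    """1-NN classification: one pass over the stored embeddings."""
    return labels[np.argmin(np.sum((E - query) ** 2, axis=1))]

# Toy usage with random, untrained parameters and sequences of different lengths.
rng = np.random.default_rng(0)
d_in, n_hidden, d_emb = 3, 8, 2
params = (
    rng.normal(scale=0.5, size=(n_hidden, d_in)),      # W_in
    rng.normal(scale=0.5, size=(n_hidden, n_hidden)),  # W_rec
    np.zeros(n_hidden),                                # b
    rng.normal(scale=0.5, size=(d_emb, n_hidden)),     # W_out
)
sequences = [rng.normal(size=(T, d_in)) for T in (5, 9, 12, 7)]
labels = np.array([0, 0, 1, 1])
E = np.stack([embed_sequence(s, *params) for s in sequences])
score = nca_objective(E, labels)
```

Because every sequence ends up as a row of `E` regardless of its length, any method that expects fixed-length vectors can be applied to `E` directly; the nearest neighbour lookup above costs time linear in the number of stored embeddings.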


Recurrent Neural Network · Convolutional Neural Network · Neural Information Processing System · Handwriting Recognition · Cross Entropy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Justin Bayer (1)
  • Christian Osendorfer (1)
  • Patrick van der Smagt (2)
  1. Chair for Robotics and Embedded Systems, Institut für Informatik, Technische Universität München, Germany
  2. Institute of Robotics and Mechatronics, DLR German Aerospace Center, Germany
