
Improving Long-Term Online Prediction with Decoupled Extended Kalman Filters

  • Juan A. Pérez-Ortiz
  • Jürgen Schmidhuber
  • Felix A. Gers
  • Douglas Eck
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2415)

Abstract

Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform traditional RNNs on sequences involving not only short-term but also long-term dependencies. The decoupled extended Kalman filter learning algorithm (DEKF) works well in online environments and significantly reduces the number of training steps compared to standard gradient-descent algorithms. Previous work on LSTM, however, has always used a form of gradient descent and has not focused on truly online settings. Here we combine LSTM with DEKF and show that this hybrid improves on the original learning algorithm when applied to online processing.
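To make the combination concrete, below is a minimal sketch of a single online DEKF weight update for a network with one scalar output. The function name dekf_step, the arbitrary grouping of the weights, and the noise constants r and q are illustrative assumptions, not details taken from the paper; in the authors' setting the per-group Jacobians H would come from LSTM's truncated gradient machinery.

import numpy as np

def dekf_step(weights, P, H, target, output, r=1.0, q=1e-4):
    """One online decoupled-EKF update (illustrative sketch).

    weights : list of per-group weight vectors w_i
    P       : list of per-group covariance matrices P_i
    H       : list of per-group Jacobian vectors dy/dw_i
    target, output : desired and actual scalar network output
    r       : measurement-noise variance (scalar output assumed)
    q       : artificial process noise keeping each P_i well conditioned
    """
    # Scalar innovation variance shared by all groups: r + sum_i H_i^T P_i H_i
    s = r + sum(h @ (p @ h) for h, p in zip(H, P))
    error = target - output
    for i, (w, p, h) in enumerate(zip(weights, P, H)):
        k = (p @ h) / s                       # Kalman gain for group i
        weights[i] = w + k * error            # weight update
        P[i] = p - np.outer(k, h @ p) + q * np.eye(len(w))  # covariance update
    return weights, P

The point of the decoupling is visible in the loop: the filter keeps one small covariance matrix per weight group rather than a single full matrix over all weights, which is what makes the update cheap enough for online use.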

Keywords

Gradient Descent · Extended Kalman Filter · Recurrent Neural Network · Neural Computation · Memory Block



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Juan A. Pérez-Ortiz (1)
  • Jürgen Schmidhuber (2)
  • Felix A. Gers (3)
  • Douglas Eck (2)
  1. DLSI, Universitat d'Alacant, Alacant, Spain
  2. IDSIA, Manno, Switzerland
  3. Mantik Bioinformatik GmbH, Berlin, Germany
