Learning Context Sensitive Languages with LSTM Trained with Kalman Filters

  • Conference paper
Artificial Neural Networks — ICANN 2002 (ICANN 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2415)
Abstract

Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n ≤ 10) of the context sensitive language a^nb^nc^n to deal correctly with values of n up to 1000 and more. Even when we consider the relatively high update complexity per timestep, in many cases the hybrid offers faster learning than LSTM by itself.
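To make the learning task concrete, the following minimal sketch (not from the paper; function names are illustrative) generates exemplars of the context sensitive language a^nb^nc^n and checks membership. It mirrors the setup described in the abstract: training uses only the 10 shortest strings (n ≤ 10), while generalization is tested on much longer strings (n up to 1000 and more).

```python
def anbncn(n):
    """Return the string a^n b^n c^n, e.g. anbncn(2) == 'aabbcc'."""
    return "a" * n + "b" * n + "c" * n

def is_anbncn(s):
    """Membership test for {a^n b^n c^n : n >= 1}.

    The string length must be divisible by 3, and the string must equal
    the canonical exemplar for n = len(s) // 3.
    """
    n = len(s) // 3
    return n >= 1 and s == anbncn(n)

# Training set as described in the abstract: the 10 shortest exemplars.
train = [anbncn(n) for n in range(1, 11)]

# Generalization set: much longer strings, as in the paper's evaluation.
test = [anbncn(n) for n in (100, 500, 1000)]
```

Because counting the a's, b's, and c's requires unbounded memory, no finite automaton (and hence no network limited to regular-language dynamics) can decide this language; that is what makes generalization from n ≤ 10 to n ≈ 1000 a meaningful test.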

Work supported by SNF grant 2100-49’144.96, Spanish Comisión Interministerial de Ciencia y Tecnología grant TIC2000-1599-C02-02, and Generalitat Valenciana grant FPI-99-14-268. J.R. Dorronsoro (Ed.): ICANN 2002, LNCS 2415, pp. 655-660, 2002.





Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gers, F.A., Pérez-Ortiz, J.A., Eck, D., Schmidhuber, J. (2002). Learning Context Sensitive Languages with LSTM Trained with Kalman Filters. In: Dorronsoro, J.R. (eds) Artificial Neural Networks — ICANN 2002. ICANN 2002. Lecture Notes in Computer Science, vol 2415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46084-5_107

  • DOI: https://doi.org/10.1007/3-540-46084-5_107

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44074-1

  • Online ISBN: 978-3-540-46084-8
