
Dynamic Cortex Memory: Enhancing Recurrent Neural Networks for Gradient-Based Sequence Learning

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 8681)

Abstract

In this paper, a novel recurrent neural network (RNN) model for gradient-based sequence learning is introduced. The presented dynamic cortex memory (DCM) is an extension of the well-known long short-term memory (LSTM) model. The main innovation of the DCM is the enhancement of the inner interplay between the gates and the error carousel through several new, trainable connections, which enable a direct signal transfer between the gates. With this enhancement, the networks converge faster during training with back-propagation through time (BPTT) than LSTMs under the same training conditions. Furthermore, DCMs yield better generalization results than LSTMs. This behaviour is shown for several supervised problem scenarios, including storing precise values, an adding task, and learning a context-sensitive grammar.
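The abstract describes the DCM only at a high level: an LSTM cell whose gates are additionally wired to one another by trainable connections. As a rough illustration, the following Python sketch shows one plausible reading of such a cell's forward step. The LSTM backbone follows the standard Hochreiter-Schmidhuber equations; the gate-to-gate wiring (the G weights and the gates_prev argument) is an assumption made here for illustration, since the exact connectivity is given only in the full paper.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class DCMCellSketch:
        """One LSTM-style memory cell with hypothetical trainable
        gate-to-gate connections (an illustrative reading of the DCM)."""

        def __init__(self, n_in, seed=0):
            rng = np.random.default_rng(seed)
            # Standard LSTM weights: input (i), forget (f), output (o)
            # gates and the cell input (z).
            self.W = {g: rng.normal(0.0, 0.1, n_in) for g in "ifoz"}
            self.b = {g: 0.0 for g in "ifoz"}
            # Assumed DCM extension: one trainable scalar weight from each
            # gate's previous activation to every other gate's pre-activation.
            self.G = {(s, d): rng.normal(0.0, 0.1)
                      for s in "ifo" for d in "ifo" if s != d}

        def step(self, x, c_prev, gates_prev):
            # Pre-activations driven by the input, as in a plain LSTM.
            a = {g: float(self.W[g] @ x) + self.b[g] for g in "ifoz"}
            # Direct gate-to-gate signal transfer, added before the
            # gate nonlinearities are applied.
            for (s, d), w in self.G.items():
                a[d] += w * gates_prev[s]
            i, f, o = sigmoid(a["i"]), sigmoid(a["f"]), sigmoid(a["o"])
            c = f * c_prev + i * np.tanh(a["z"])  # constant error carousel
            h = o * np.tanh(c)                    # cell output
            return h, c, {"i": i, "f": f, "o": o}

Unrolled over a sequence, the cell is driven step by step while the gate activations are fed back into the next step:

    cell = DCMCellSketch(n_in=3)
    h, c = 0.0, 0.0
    gates = {"i": 0.5, "f": 0.5, "o": 0.5}   # neutral initial gate state
    for x in np.random.default_rng(1).normal(size=(5, 3)):
        h, c, gates = cell.step(x, c, gates)

Training W, b, and the extra G parameters jointly with BPTT is the setting the paper's experiments compare against a plain LSTM; the sketch omits batching, recurrent hidden-to-hidden weights, and the backward pass for brevity.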

Keywords

  • Dynamic Cortex Memory (DCM)
  • Recurrent Neural Networks (RNN)
  • Neural Networks
  • Long Short-Term Memory (LSTM)




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Otte, S., Liwicki, M., Zell, A. (2014). Dynamic Cortex Memory: Enhancing Recurrent Neural Networks for Gradient-Based Sequence Learning. In: Artificial Neural Networks and Machine Learning – ICANN 2014. Lecture Notes in Computer Science, vol 8681. Springer, Cham. https://doi.org/10.1007/978-3-319-11179-7_1


  • DOI: https://doi.org/10.1007/978-3-319-11179-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11178-0

  • Online ISBN: 978-3-319-11179-7

  • eBook Packages: Computer Science (R0)