Abstract
In this paper a novel recurrent neural network (RNN) model for gradient-based sequence learning is introduced. The presented dynamic cortex memory (DCM) is an extension of the well-known long short-term memory (LSTM) model. The main innovation of the DCM is an enhanced interplay between the gates and the error carousel, achieved through several new trainable connections that enable a direct signal transfer from one gate to another. With this enhancement the networks converge faster during training with back-propagation through time (BPTT) than LSTMs under the same training conditions. Furthermore, DCMs yield better generalization results than LSTMs. This behaviour is demonstrated on several supervised problem scenarios, including storing precise values, adding, and learning a context-sensitive grammar.
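The abstract describes the DCM only at a high level: an LSTM cell augmented with additional trainable connections that carry signals directly between the gates. The following Python/numpy sketch illustrates one plausible reading of that idea, in which each gate's pre-activation also receives the previous time step's activations of the other gates through per-unit trainable weights. The class name DCMCellSketch, the exact wiring, and the choice of per-unit (rather than full-matrix) gate-to-gate weights are assumptions made for illustration; the paper's actual DCM equations may differ.

import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class DCMCellSketch:
    """LSTM-style cell with extra trainable gate-to-gate connections
    (hypothetical wiring; the paper's exact DCM equations may differ)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # Standard LSTM parameters for the input (i), forget (f), output (o)
        # gates and the cell input (z).
        self.W = {g: rng.normal(0.0, s, (n_hidden, n_in)) for g in "ifoz"}
        self.R = {g: rng.normal(0.0, s, (n_hidden, n_hidden)) for g in "ifoz"}
        self.b = {g: np.zeros(n_hidden) for g in "ifoz"}
        # Assumed DCM-style parameters: per-unit weights that feed the previous
        # activation of every other gate directly into each gate.
        self.G = {(src, dst): rng.normal(0.0, s, n_hidden)
                  for src in "ifo" for dst in "ifo" if src != dst}

    def step(self, x, h_prev, c_prev, gates_prev):
        def preact(g):
            a = self.W[g] @ x + self.R[g] @ h_prev + self.b[g]
            if g in "ifo":
                # Direct signal transfer from the other gates (the assumed
                # DCM addition); a plain LSTM omits these terms.
                for src in "ifo":
                    if src != g:
                        a = a + self.G[(src, g)] * gates_prev[src]
            return a

        i = sigmoid(preact("i"))
        f = sigmoid(preact("f"))
        o = sigmoid(preact("o"))
        z = np.tanh(preact("z"))
        c = f * c_prev + i * z          # constant error carousel
        h = o * np.tanh(c)
        return h, c, {"i": i, "f": f, "o": o}


# Minimal usage under the same assumptions: run a short random sequence
# through one cell, carrying the gate activations from step to step.
n_in, n_hidden = 4, 8
cell = DCMCellSketch(n_in, n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
gates = {g: np.zeros(n_hidden) for g in "ifo"}
for x in np.random.default_rng(1).normal(size=(5, n_in)):
    h, c, gates = cell.step(x, h, c, gates)

The design point the sketch is meant to convey is that, relative to a standard LSTM, the only structural change is the extra gate-to-gate terms in each gate's pre-activation; the constant error carousel itself is left untouched.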