Abstract
Long Short-Term Memory (LSTM) is one of the most effective recent methods for supervised sequence learning. It uses gradient descent to train memory cells represented as differentiable computational graph structures. Interestingly, the structure of the LSTM cell appears somewhat arbitrary. In this paper we optimize that computational structure with a multi-objective evolutionary algorithm whose fitness function reflects a structure's usefulness for learning various formal languages. The evolved cells help identify which structural features are crucial for sequence learning.
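For context, the hand-designed cell that such evolution starts from combines a self-connected cell state with multiplicative input, forget, and output gates. Below is a minimal sketch of one forward step of this standard cell (Hochreiter and Schmidhuber, 1997, with forget gates as introduced by Gers et al., 2000) in plain NumPy; the weight names (Wi, Wf, Wo, Wg and their biases) are illustrative placeholders, not taken from the paper, and the evolved cells rewire exactly this kind of computational graph.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_cell_step(x, h_prev, c_prev, W):
        """One forward step of a standard LSTM memory cell with forget gate.
        W is a dict of (hypothetical) weight matrices and bias vectors."""
        # Concatenate the external input and the previous cell output.
        z = np.concatenate([x, h_prev])
        i = sigmoid(W["Wi"] @ z + W["bi"])   # input gate
        f = sigmoid(W["Wf"] @ z + W["bf"])   # forget gate
        o = sigmoid(W["Wo"] @ z + W["bo"])   # output gate
        g = np.tanh(W["Wg"] @ z + W["bg"])   # candidate cell input
        c = f * c_prev + i * g               # gated update of the cell state
        h = o * np.tanh(c)                   # gated cell output
        return h, c

    # Illustrative usage with small random weights.
    rng = np.random.default_rng(0)
    n_in, n_hid = 3, 4
    W = {k: 0.1 * rng.standard_normal((n_hid, n_in + n_hid))
         for k in ("Wi", "Wf", "Wo", "Wg")}
    W.update({k: np.zeros(n_hid) for k in ("bi", "bf", "bo", "bg")})
    h, c = lstm_cell_step(rng.standard_normal(n_in),
                          np.zeros(n_hid), np.zeros(n_hid), W)

The additive cell-state update (c = f * c_prev + i * g) is what lets gradients flow over long time lags; the evolutionary search in the paper explores alternative wirings of these gating and state-update operations.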
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Bayer, J., Wierstra, D., Togelius, J., Schmidhuber, J. (2009). Evolving Memory Cell Structures for Sequence Learning. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds) Artificial Neural Networks – ICANN 2009. ICANN 2009. Lecture Notes in Computer Science, vol 5769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04277-5_76
Print ISBN: 978-3-642-04276-8
Online ISBN: 978-3-642-04277-5