Sequential Constant Size Compressors for Reinforcement Learning

  • Conference paper
Artificial General Intelligence (AGI 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6830)

Abstract

Traditional Reinforcement Learning methods are insufficient for AGIs, which must be able to learn in Partially Observable Markov Decision Processes (POMDPs). We investigate a novel approach to this problem: standard RL techniques that take as input the hidden-layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Memory (RAAM), trained through standard back-propagation. Results illustrate the feasibility of this approach: the system learns to handle high-dimensional visual observations (up to 640 pixels) in partially observable environments with long time lags (up to 12 steps) between relevant sensory information and the necessary action.
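
The idea in the abstract can be illustrated with a short sketch. Below is a minimal, hypothetical rendering (not the authors' code) of a sequential RAAM used as an SCSC: at each step an encoder folds the current observation and the previous constant-size code into a new code, and a decoder is trained by standard back-propagation to reconstruct both from that code. The code size, learning rate, per-step (non-BPTT) training, and random placeholder observations are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # 640-pixel observations as in the paper; code size and learning
    # rate are assumptions for this sketch.
    OBS_DIM, CODE_DIM, LR = 640, 32, 0.01

    # Encoder maps (previous code, observation) -> code;
    # the decoder tries to invert it: code -> (previous code, observation).
    W_enc = rng.normal(0.0, 0.1, (CODE_DIM, CODE_DIM + OBS_DIM))
    W_dec = rng.normal(0.0, 0.1, (CODE_DIM + OBS_DIM, CODE_DIM))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def fold(code, obs):
        """Compress one observation into the constant-size code and take
        one gradient step on the squared reconstruction error."""
        global W_enc, W_dec
        inp = np.concatenate([code, obs])    # reconstruction target
        new_code = sigmoid(W_enc @ inp)      # constant-size summary of history
        recon = sigmoid(W_dec @ new_code)    # reconstructed (code, obs)
        err = recon - inp
        # Standard back-propagation through decoder, then encoder.
        d_dec = err * recon * (1.0 - recon)
        d_enc = (W_dec.T @ d_dec) * new_code * (1.0 - new_code)
        W_dec -= LR * np.outer(d_dec, new_code)
        W_enc -= LR * np.outer(d_enc, inp)
        return new_code

    # Feed a 12-step sequence (the longest lag reported in the abstract);
    # the final code would be the state handed to a standard RL learner.
    code = np.zeros(CODE_DIM)
    for _ in range(12):
        obs = rng.random(OBS_DIM)            # placeholder pixel vector
        code = fold(code, obs)
    print(code.shape)                        # -> (32,)

Because the code has a fixed size regardless of sequence length, the downstream RL learner always sees a constant-dimensional state, even when the relevant sensory information lies many steps in the past.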

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gisslén, L., Luciw, M., Graziano, V., Schmidhuber, J. (2011). Sequential Constant Size Compressors for Reinforcement Learning. In: Schmidhuber, J., Thórisson, K.R., Looks, M. (eds) Artificial General Intelligence. AGI 2011. Lecture Notes in Computer Science (LNAI), vol. 6830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22887-2_4

  • DOI: https://doi.org/10.1007/978-3-642-22887-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22886-5

  • Online ISBN: 978-3-642-22887-2

  • eBook Packages: Computer Science, Computer Science (R0)
