Abstract
We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, temporal network for transitions (TNT), enjoys the benefits of unsupervised learning: it works on-line in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both the exploitation and exploration of the environment. Experiments demonstrate that TNT learns effective representations of classical reinforcement learning mazes of varying size (up to 20×20) under conditions of high noise and stochastic actions.
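The sensory layer described above builds on the classic self-organizing map. As background, the following is a minimal sketch of standard SOM training (not the TNT architecture itself): each grid node holds a weight vector, the best-matching unit for an input is found, and all nodes are pulled toward the input with a Gaussian neighborhood factor that shrinks over time. All function and parameter names here are illustrative.

```python
import numpy as np

def train_som(data, grid_h=4, grid_w=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small 2-D self-organizing map on `data` (n_samples x n_features)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    # One weight vector per grid node, initialized randomly.
    weights = rng.random((grid_h, grid_w, n_features))
    # Grid coordinates, used for neighborhood distances on the map.
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    coords = np.stack([ys, xs], axis=-1).astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            t = step / n_steps
            lr = lr0 * (1.0 - t)              # decaying learning rate
            sigma = sigma0 * (1.0 - t) + 0.5  # shrinking neighborhood radius
            # Best-matching unit: node whose weight is closest to the input.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood centered on the BMU in grid space.
            grid_d2 = np.sum((coords - coords[bmu]) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull every node toward the input, scaled by the neighborhood.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights
```

After training, nearby grid nodes respond to similar inputs, giving the kind of topology-preserving discretization of the observation space that a SOM-based sensory layer relies on.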
Keywords
- Self-Organizing Maps
- POMDPs
- Reinforcement Learning
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Graziano, V., Koutník, J., Schmidhuber, J. (2011). Unsupervised Modeling of Partially Observable Environments. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol. 6911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23780-5_42
DOI: https://doi.org/10.1007/978-3-642-23780-5_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23779-9
Online ISBN: 978-3-642-23780-5