Abstract
Applying reinforcement learning algorithms in real-world domains is challenging because relevant state information is often embedded in a stream of high-dimensional sensor data. This paper describes a novel algorithm for learning task-relevant features through interactions with the environment. The key idea is that a feature is likely to be useful to the degree that its dynamics can be controlled by the actions of the agent. We describe an algorithm that discovers such features and demonstrate its effectiveness in an artificial domain.
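The abstract's key idea can be illustrated with a minimal sketch (this is not the paper's algorithm; the scoring function, toy data, and all names are assumptions for illustration): score each candidate feature by how much knowing the agent's action improves a linear prediction of that feature's next value. Features whose dynamics the agent can control score high.

```python
import numpy as np

rng = np.random.default_rng(0)

def contingency_score(f, f_next, a):
    """Reduction in mean squared error when predicting f_next from (f, a)
    rather than from f alone. Larger values suggest the feature's dynamics
    are contingent on the agent's actions."""
    X_no_action = np.column_stack([f, np.ones_like(f)])
    X_action = np.column_stack([f, a, np.ones_like(f)])

    def mse(X):
        w, *_ = np.linalg.lstsq(X, f_next, rcond=None)
        return np.mean((X @ w - f_next) ** 2)

    return mse(X_no_action) - mse(X_action)

# Toy data: feature 0 is driven partly by the action, feature 1 drifts on its own.
T = 1000
a = rng.choice([-1.0, 1.0], size=T)
f0 = rng.normal(size=T)
f0_next = 0.9 * f0 + 0.5 * a + 0.1 * rng.normal(size=T)  # controllable
f1 = rng.normal(size=T)
f1_next = 0.9 * f1 + 0.1 * rng.normal(size=T)            # uncontrollable

print("controllable feature score:  ", contingency_score(f0, f0_next, a))
print("uncontrollable feature score:", contingency_score(f1, f1_next, a))
```

Run on the toy data, the controllable feature receives a clearly larger score, matching the intuition that action-dependent dynamics signal task relevance.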
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Sprague, N. (2014). Contingent Features for Reinforcement Learning. In: Wermter, S., et al. Artificial Neural Networks and Machine Learning – ICANN 2014. ICANN 2014. Lecture Notes in Computer Science, vol 8681. Springer, Cham. https://doi.org/10.1007/978-3-319-11179-7_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11178-0
Online ISBN: 978-3-319-11179-7
eBook Packages: Computer Science (R0)