Projective Simulation for Classical Learning Agents: A Comprehensive Investigation

Mautner, Julian; Makmal, Adi; Manzano, Daniel; Tiersch, Markus; Briegel, Hans J.

doi:10.1007/s00354-015-0102-0

Published: 28 January 2015

Volume 33, pages 69–114 (2015)
New Generation Computing

Julian Mautner^1,2, Adi Makmal^1,2, Daniel Manzano^1,2,3, Markus Tiersch^1,2 & Hans J. Briegel^1,2

Abstract

We study the model of projective simulation (PS), a novel approach to artificial intelligence based on stochastic processing of episodic memory, which was recently introduced (Briegel and De las Cuevas, 2012). Here we provide a detailed analysis of the model and examine its performance, including its achievable efficiency, its learning times, and the way both properties scale with the dimension of the problem. In addition, we situate the PS agent in different learning scenarios and study its learning abilities. We consider a variety of new scenarios, thereby demonstrating the model’s flexibility. Furthermore, to put the PS scheme in context, we compare its performance with that of Q-learning and learning classifier systems, two popular models in the field of reinforcement learning. We show that PS is a competitive artificial intelligence model with unique properties and strengths.

References

Adam, S., Busoniu, L. and Babuska, R., “Experience Replay for Real-Time Reinforcement Learning Control,” in Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 42, pp. 201–212, 2012.
Briegel, H. J. and De las Cuevas, G., “Projective simulation for artificial intelligence,” in Sci. Rep. 2, 400, 2012.
Bull, L. and Kovacs, T. (Eds.), Foundations of Learning Classifier Systems, Studies in Fuzziness and Soft Computing, 183, Springer Berlin-Heidelberg, 2005.
Butz, M. V., Shirinov, E. and Reif, K. L., “Self-Organizing Sensorimotor Maps Plus Internal Motivations Yield Animal-Like Behavior,” in Adaptive Behavior, 18, pp. 315–337, 2010.
Butz, M. V. and Wilson, S. W., “An Algorithmic Description of XCS,” in Proc. IWLCS ’00 Revised Papers from the Third International Workshop on Advances in Learning Classifier Systems, pp. 253–272, Springer-Verlag London, U.K., 2001.
Dietterich, T. G., “Hierarchical reinforcement learning with the MAXQ value function decomposition,” in Journal of Artificial Intelligence Research, 13, pp. 227–303, 2000.
Floreano, D. and Mattiussi, C., Bio-inspired artificial intelligence: theories, methods, and technologies, Intelligent robotics and autonomous agents, MIT Press, Cambridge Massachusetts, 2008.
Holland, J. H., Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
Lin, L. J., “Self-improving reactive agents based on reinforcement learning, planning and teaching,” in Machine Learning 8, pp. 292–321, 1992.
Ormoneit, D. and Sen, S., “Kernel-based reinforcement learning,” in Machine Learning, 49, pp. 161–178, 2002.
Pfeifer, R. and Scheier, C., Understanding intelligence (First ed.), MIT Press, Cambridge Massachusetts, 1999.
Poole, D., Mackworth, A. and Goebel, R., Computational intelligence: A logical approach, Oxford University Press, 1998.
Parr, R. and Russell, S., “Reinforcement Learning with Hierarchies of Abstract Machines,” in Advances in Neural Information Processing Systems 10, pp. 1043–1049, MIT Press, 1997.
Russell, S. J. and Norvig, P., Artificial intelligence: A modern approach (Second ed.), Prentice Hall, New Jersey, 2003.
Sutton, R. S., Temporal Credit Assignment in Reinforcement Learning, PhD Thesis, University of Massachusetts at Amherst, 1984.
Sutton, R. S., “Integrated architectures for learning, planning, and reacting based on approximating dynamic programming,” in Proc. of the Seventh International Conference on Machine Learning, Morgan Kaufmann, pp. 216–224, 1990.
Sutton, R. S. and Barto, A. G., Reinforcement learning: An introduction (First edition), MIT Press, Cambridge Massachusetts, 1998.
Sutton, R. S., Precup, D. and Singh, S., “Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning,” in Artificial Intelligence, 112, pp. 181–211, 1999.
Sutton, R. S., Szepesvari, C., Geramifard, A. and Bowling, M., “Dyna-style planning with linear function approximation and prioritized sweeping,” in Proc. of the 24th Conference on Uncertainty in Artificial Intelligence, pp. 528–536, 2008.
Toussaint, M., “A sensorimotor map: Modulating lateral interactions for anticipation and planning,” in Neural Computation 18, pp. 1132–1155, 2006.
Urbanowicz, R. J. and Moore, J. H., “Learning Classifier Systems: A Complete Introduction, Review, and Roadmap,” in Journal of Artificial Evolution and Applications, 2009, Article ID 736398, 2009. doi:10.1155/2009/736398.
Watkins, C. J. C. H., Learning from delayed rewards, PhD Thesis, University of Cambridge, England, 1989.
Watkins, C. J. C. H. and Dayan, P., “Q-learning,” in Machine Learning, 8, pp. 279–292, 1992.
Wilson, S. W., “Classifier Fitness Based on Accuracy,” in Evol. Comput. 3(2), pp. 149–175, 1995.

Author information

Authors and Affiliations

Institut für Theoretische Physik, Universität Innsbruck, Technikerstraße 25, A-6020, Innsbruck, Austria
Julian Mautner, Adi Makmal, Daniel Manzano, Markus Tiersch & Hans J. Briegel
Institut für Quantenoptik und Quanteninformation der Österreichischen Akademie der Wissenschaften, Innsbruck, Austria
Julian Mautner, Adi Makmal, Daniel Manzano, Markus Tiersch & Hans J. Briegel
Instituto Carlos I de Física Teórica y Computacional, University of Granada, Granada, Spain
Daniel Manzano

About this article

Cite this article

Mautner, J., Makmal, A., Manzano, D. et al. Projective Simulation for Classical Learning Agents: A Comprehensive Investigation. New Gener. Comput. 33, 69–114 (2015). https://doi.org/10.1007/s00354-015-0102-0

Received: 21 October 2013
Revised: 27 March 2014
Published: 28 January 2015
Issue Date: January 2015
DOI: https://doi.org/10.1007/s00354-015-0102-0
