New Generation Computing, Volume 33, Issue 1, pp. 69–114

Projective Simulation for Classical Learning Agents: A Comprehensive Investigation

  • Julian Mautner
  • Adi Makmal
  • Daniel Manzano
  • Markus Tiersch
  • Hans J. Briegel
Article

Abstract

We study the model of projective simulation (PS), a recently introduced approach to artificial intelligence based on stochastic processing of episodic memory.2) Here we provide a detailed analysis of the model and examine its performance, including its achievable efficiency, its learning times, and the way both properties scale with the problem's dimension. In addition, we place the PS agent in different learning scenarios and study its learning abilities. A variety of new scenarios is considered, demonstrating the model's flexibility. Furthermore, to put the PS scheme in context, we compare its performance with that of Q-learning and learning classifier systems, two popular models in the field of reinforcement learning. We show that PS is a competitive artificial intelligence model with unique properties and strengths.
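For orientation only, the kind of two-layer PS agent the abstract describes can be sketched as follows. The update rule (hopping probabilities proportional to edge h-values, a damping parameter γ, and the reward added to the traversed edge) follows the basic scheme of Briegel and De las Cuevas,2) but the class name, parameter values, and toy reward scheme below are illustrative assumptions, not the paper's implementation.

```python
import random

class PSAgent:
    """Minimal two-layer projective simulation agent (illustrative sketch).

    Episodic memory is a bipartite graph of percept clips and action clips,
    with one h-value per (percept, action) edge, all initialized to 1.
    """

    def __init__(self, n_percepts, n_actions, gamma=0.0):
        self.n_actions = n_actions
        self.gamma = gamma  # damping ("forgetting") parameter
        self.h = [[1.0] * n_actions for _ in range(n_percepts)]

    def act(self, percept):
        # Random walk over episodic memory: a single hop from the percept
        # clip to an action clip, sampled in proportion to the h-values.
        return random.choices(range(self.n_actions),
                              weights=self.h[percept])[0]

    def learn(self, percept, action, reward):
        # Damp all h-values toward their initial value 1, then reinforce
        # the edge that was actually traversed by the received reward.
        for p in range(len(self.h)):
            for a in range(self.n_actions):
                self.h[p][a] -= self.gamma * (self.h[p][a] - 1.0)
        self.h[percept][action] += reward

# Toy usage: a deterministic task where percept i demands action i.
random.seed(1)
agent = PSAgent(n_percepts=2, n_actions=2)
for _ in range(200):
    percept = random.randrange(2)
    action = agent.act(percept)
    agent.learn(percept, action, 1.0 if action == percept else 0.0)
```

After training, the h-value of each rewarded edge dominates its alternatives, so the agent picks the correct action with high probability.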

Keywords

Artificial Intelligence · Reinforcement Learning · Embodied Agent · Projective Simulation


References

  1. Adam, S., Busoniu, L. and Babuska, R., "Experience Replay for Real-Time Reinforcement Learning Control," in IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42, pp. 201–212, 2012.
  2. Briegel, H. J. and De las Cuevas, G., "Projective simulation for artificial intelligence," in Sci. Rep., 2, 400, 2012.
  3. Bull, L. and Kovacs, T. (Eds.), Foundations of Learning Classifier Systems, Studies in Fuzziness and Soft Computing, 183, Springer, Berlin-Heidelberg, 2005.
  4. Butz, M. V., Shirinov, E. and Reif, K. L., "Self-Organizing Sensorimotor Maps Plus Internal Motivations Yield Animal-Like Behavior," in Adaptive Behavior, 18, pp. 315–337, 2010.
  5. Butz, M. V. and Wilson, S. W., "An Algorithmic Description of XCS," in Proc. IWLCS '00, Revised Papers from the Third International Workshop on Advances in Learning Classifier Systems, pp. 253–272, Springer-Verlag, London, U.K., 2001.
  6. Dietterich, T. G., "Hierarchical reinforcement learning with the MAXQ value function decomposition," in Journal of Artificial Intelligence Research, 13, pp. 227–303, 2000.
  7. Floreano, D. and Mattiussi, C., Bio-inspired Artificial Intelligence: Theories, Methods, and Technologies, Intelligent Robotics and Autonomous Agents, MIT Press, Cambridge, Massachusetts, 2008.
  8. Holland, J. H., Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
  9. Lin, L. J., "Self-improving reactive agents based on reinforcement learning, planning and teaching," in Machine Learning, 8, pp. 292–321, 1992.
  10. Ormoneit, D. and Sen, S., "Kernel-based reinforcement learning," in Machine Learning, 49, pp. 161–178, 2002.
  11. Pfeifer, R. and Scheier, C., Understanding Intelligence (First ed.), MIT Press, Cambridge, Massachusetts, 1999.
  12. Poole, D., Mackworth, A. and Goebel, R., Computational Intelligence: A Logical Approach, Oxford University Press, 1998.
  13. Parr, R. and Russell, S., "Reinforcement Learning with Hierarchies of Abstract Machines," in Advances in Neural Information Processing Systems 10, pp. 1043–1049, MIT Press, 1997.
  14. Russell, S. J. and Norvig, P., Artificial Intelligence: A Modern Approach (Second ed.), Prentice Hall, New Jersey, 2003.
  15. Sutton, R. S., Temporal Credit Assignment in Reinforcement Learning, PhD Thesis, University of Massachusetts at Amherst, 1984.
  16. Sutton, R. S., "Integrated architectures for learning, planning, and reacting based on approximating dynamic programming," in Proc. of the Seventh International Conference on Machine Learning, pp. 216–224, Morgan Kaufmann, 1990.
  17. Sutton, R. S. and Barto, A. G., Reinforcement Learning: An Introduction (First ed.), MIT Press, Cambridge, Massachusetts, 1998.
  18. Sutton, R. S., Precup, D. and Singh, S., "Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning," in Artificial Intelligence, 112, pp. 181–211, 1999.
  19. Sutton, R. S., Szepesvari, C., Geramifard, A. and Bowling, M., "Dyna-style planning with linear function approximation and prioritized sweeping," in Proc. of the 24th Conference on Uncertainty in Artificial Intelligence, pp. 528–536, 2008.
  20. Toussaint, M., "A sensorimotor map: Modulating lateral interactions for anticipation and planning," in Neural Computation, 18, pp. 1132–1155, 2006.
  21. Urbanowicz, R. J. and Moore, J. H., "Learning Classifier Systems: A Complete Introduction, Review, and Roadmap," in Journal of Artificial Evolution and Applications, 2009, Article ID 736398, 2009. doi: 10.1155/2009/736398.
  22. Watkins, C. J. C. H., Learning from Delayed Rewards, PhD Thesis, University of Cambridge, England, 1989.
  23. Watkins, C. J. C. H. and Dayan, P., "Q-learning," in Machine Learning, 8, pp. 279–292, 1992.
  24. Wilson, S. W., "Classifier Fitness Based on Accuracy," in Evol. Comput., 3(2), pp. 149–175, 1995.

Copyright information

© Ohmsha and Springer Japan 2015

Authors and Affiliations

  • Julian Mautner (1, 2)
  • Adi Makmal (1, 2)
  • Daniel Manzano (1, 2, 3)
  • Markus Tiersch (1, 2)
  • Hans J. Briegel (1, 2)
  1. Institut für Theoretische Physik, Universität Innsbruck, Innsbruck, Austria
  2. Institut für Quantenoptik und Quanteninformation der Österreichischen Akademie der Wissenschaften, Innsbruck, Austria
  3. Instituto Carlos I de Física Teórica y Computacional, University of Granada, Granada, Spain
