A Hybrid Learning Strategy for Discovery of Policies of Action

  • Richardson Ribeiro
  • Fabrício Enembreck
  • Alessandro L. Koerich
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4140)

Abstract

This paper presents a novel hybrid learning method and a performance evaluation methodology for adaptive autonomous agents. Measuring the performance of a learning agent is not a trivial task and generally requires long simulations as well as knowledge about the domain. A generic evaluation methodology has been developed to precisely evaluate the performance of policy estimation techniques. This methodology has been integrated into a hybrid learning algorithm whose aim is to decrease the learning time and the number of errors of an adaptive agent. The hybrid learning method, named K-learning, integrates the Q-learning and K-Nearest-Neighbors algorithms. Experiments show that the K-learning algorithm converges to a good policy faster than the Q-learning algorithm.
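
The abstract does not give the details of K-learning, so the following is only an illustrative sketch of how a tabular Q-learning update might be paired with a nearest-neighbor lookup. It is not the authors' algorithm. The action set, the representation of states as (x, y) tuples, the Euclidean distance, and the names `q_update` and `knn_action` are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict

# Hypothetical action set for a grid-like environment (assumption).
ACTIONS = ["up", "down", "left", "right"]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a table `q`."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


def knn_action(q, visited, state, k=3):
    """Choose the action with the highest Q-value averaged over the
    k visited states nearest to `state`.

    Assumption: states are numeric tuples compared by Euclidean distance.
    Falls back to a random action when no state has been visited yet.
    """
    neighbors = sorted(visited, key=lambda s: math.dist(s, state))[:k]
    if not neighbors:
        return random.choice(ACTIONS)

    def avg_q(action):
        return sum(q[(s, action)] for s in neighbors) / len(neighbors)

    return max(ACTIONS, key=avg_q)


# Usage: learn from one transition, then query an unvisited state.
q = defaultdict(float)
visited = {(0, 0)}
q_update(q, (0, 0), "right", reward=1.0, next_state=(1, 0))
suggested = knn_action(q, visited, (0, 1), k=1)
```

In this sketch the neighbor lookup lets the agent reuse values learned in nearby states before it has visited a state itself, which is one plausible way an instance-based method could shorten learning time.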


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Richardson Ribeiro (1)
  • Fabrício Enembreck (1)
  • Alessandro L. Koerich (1)
  1. Programa de Pós-Graduação em Informática Aplicada (PPGIA), Pontifícia Universidade Católica do Paraná, Curitiba, Paraná, Brazil
