Combining Learning Algorithms: An Approach to Markov Decision Processes

Conference paper
Enterprise Information Systems

Abstract

In this paper we present a technique for estimating policies that combines instance-based learning and reinforcement learning algorithms in Markovian environments. The approach was developed to speed up the convergence of adaptive intelligent agents that use reinforcement learning algorithms. Accelerating an agent's learning is a delicate task, since an inadequate choice of updating technique may delay the learning process or even induce an unexpected acceleration that causes the agent to converge to an unsatisfactory policy. Experimental results in real-world scenarios show that the proposed technique speeds up the convergence of the agents while still achieving optimal policies, overcoming problems of classical reinforcement learning approaches.
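To make the idea concrete, the following Python fragment is a minimal, hypothetical sketch of such a combination: a tabular Q-learning agent whose greedy action selection is biased by an instance-based (k-nearest-neighbour) estimate computed from stored experiences. It assumes grid-world states encoded as (x, y) tuples and a Manhattan distance metric; the names HybridAgent and knn_estimate are illustrative only, and this is not the paper's exact algorithm.

import random
from collections import defaultdict

class HybridAgent:
    """Hypothetical sketch: Q-learning biased by an instance-based memory."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, k=3):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon, self.k = alpha, gamma, epsilon, k
        self.q = defaultdict(float)   # tabular Q(s, a), keyed by (state, action)
        self.memory = []              # stored (state, action, reward) instances

    @staticmethod
    def distance(s1, s2):
        # Toy metric for grid states represented as (x, y) tuples.
        return abs(s1[0] - s2[0]) + abs(s1[1] - s2[1])

    def knn_estimate(self, state, action):
        # Instance-based estimate: mean reward of the k nearest stored
        # experiences that used the same action; 0 when memory is empty.
        cases = sorted((self.distance(state, s), r)
                       for s, a, r in self.memory if a == action)
        top = cases[:self.k]
        return sum(r for _, r in top) / len(top) if top else 0.0

    def choose_action(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)   # exploration step
        # Rank actions by Q-value plus the instance-based bias term.
        return max(self.actions,
                   key=lambda a: self.q[(state, a)] + self.knn_estimate(state, a))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update; the memory grows alongside it.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
        self.memory.append((state, action, reward))

# Example usage on a toy grid (hypothetical environment interface):
# agent = HybridAgent(actions=["up", "down", "left", "right"])
# a = agent.choose_action((0, 0))
# agent.update((0, 0), a, reward=-1.0, next_state=(0, 1))

Note that in this sketch the instance-based term only biases the action ranking; the Q-update itself is standard, so the agent degrades gracefully to plain Q-learning when the experience memory is empty.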

Author information

Correspondence to Richardson Ribeiro.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ribeiro, R., Favarim, F., Barbosa, M.A.C., Koerich, A.L., Enembreck, F. (2013). Combining Learning Algorithms: An Approach to Markov Decision Processes. In: Cordeiro, J., Maciaszek, L.A., Filipe, J. (eds) Enterprise Information Systems. Lecture Notes in Business Information Processing, vol 141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40654-6_11

  • DOI: https://doi.org/10.1007/978-3-642-40654-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40653-9

  • Online ISBN: 978-3-642-40654-6

  • eBook Packages: Computer Science (R0)
