Combining Learning Algorithms: An Approach to Markov Decision Processes

Conference paper
Enterprise Information Systems

Abstract

In this paper we present a technique for estimating policies that combines instance-based learning and reinforcement learning algorithms in Markovian environments. The approach was developed to speed up the convergence of adaptive intelligent agents that use reinforcement learning algorithms. Accelerating an agent's learning is a delicate task, since an inadequate choice of updating technique may delay the learning process or even induce an unexpected acceleration that causes the agent to converge to an unsatisfactory policy. Experimental results in real-world scenarios show that the proposed technique speeds up the convergence of the agents while still achieving optimal policies, overcoming problems of classical reinforcement learning approaches.
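To make the idea concrete, the following Python fragment is a minimal, hypothetical sketch of such a combination: a tabular Q-learning agent whose greedy action selection is biased by an instance-based (k-nearest-neighbour) estimate computed from stored experiences. It assumes grid-world states encoded as (x, y) tuples and a Manhattan distance metric; the names HybridAgent and knn_estimate are illustrative only, and this is not the paper's exact algorithm.

import random
from collections import defaultdict

class HybridAgent:
    """Hypothetical sketch: Q-learning biased by an instance-based memory."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, k=3):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon, self.k = alpha, gamma, epsilon, k
        self.q = defaultdict(float)   # tabular Q(s, a), keyed by (state, action)
        self.memory = []              # stored (state, action, reward) instances

    @staticmethod
    def distance(s1, s2):
        # Toy metric for grid states represented as (x, y) tuples.
        return abs(s1[0] - s2[0]) + abs(s1[1] - s2[1])

    def knn_estimate(self, state, action):
        # Instance-based estimate: mean reward of the k nearest stored
        # experiences that used the same action; 0 when memory is empty.
        cases = sorted((self.distance(state, s), r)
                       for s, a, r in self.memory if a == action)
        top = cases[:self.k]
        return sum(r for _, r in top) / len(top) if top else 0.0

    def choose_action(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)   # exploration step
        # Rank actions by Q-value plus the instance-based bias term.
        return max(self.actions,
                   key=lambda a: self.q[(state, a)] + self.knn_estimate(state, a))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update; the memory grows alongside it.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
        self.memory.append((state, action, reward))

# Example usage on a toy grid (hypothetical environment interface):
# agent = HybridAgent(actions=["up", "down", "left", "right"])
# a = agent.choose_action((0, 0))
# agent.update((0, 0), a, reward=-1.0, next_state=(0, 1))

Note that in this sketch the instance-based term only biases the action ranking; the Q-update itself is standard, so the agent degrades gracefully to plain Q-learning when the experience memory is empty.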

Author information

Correspondence to Richardson Ribeiro.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ribeiro, R., Favarim, F., Barbosa, M.A.C., Koerich, A.L., Enembreck, F. (2013). Combining Learning Algorithms: An Approach to Markov Decision Processes. In: Cordeiro, J., Maciaszek, L.A., Filipe, J. (eds) Enterprise Information Systems. Lecture Notes in Business Information Processing, vol 141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40654-6_11

  • DOI: https://doi.org/10.1007/978-3-642-40654-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40653-9

  • Online ISBN: 978-3-642-40654-6

  • eBook Packages: Computer Science (R0)
