Abstract
The problem of getting interacting agents to learn simultaneously so as to improve their joint performance has received significant attention in the literature. One of the key challenges is managing the system-wide effects that arise from learning in a non-stationary environment. In this paper, we examine how communication between agents affects system-wide dynamics and learning convergence. Specifically, we study the problem of learning routes between locations in a graph when agents using the same edge at the same time slow each other down. We implemented and empirically examined a model in which each agent simply classifies each edge in the graph as slow, medium, or fast, depending on the other agents using that edge. Communication over a fixed social network occurs only when an agent changes the speed category it holds for a particular link, e.g., when it changes from believing a link is slow to believing it is medium. We find that the system dynamics are very sensitive to the ratio of the influence of direct observations on local beliefs to the influence of communicated beliefs. For some values of this ratio, convergence to good behavior can occur very quickly, but for others a brief period of good performance is followed by wild oscillations.
Acknowledgment
This research has been funded in part by the AFOSR MURI grant FA9550-08-1-0356.
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Scerri, P. (2013). Modulating Communication to Improve Multi-agent Learning Convergence. In: Sorokin, A., Pardalos, P. (eds) Dynamics of Information Systems: Algorithmic Approaches. Springer Proceedings in Mathematics & Statistics, vol 51. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7582-8_7
DOI: https://doi.org/10.1007/978-1-4614-7582-8_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7581-1
Online ISBN: 978-1-4614-7582-8
eBook Packages: Mathematics and Statistics (R0)