Abstract
The problem of getting interacting agents to learn simultaneously so as to improve their joint performance has received significant attention in the literature. One of the key challenges is managing the system-wide effects that arise from learning in a non-stationary environment. In this paper, we examine how communication between agents affects system-wide dynamics and learning convergence. Specifically, we study the problem of learning routes between locations in a graph when agents using the same edge at the same time slow each other down. We implemented and empirically examined a model in which each agent simply classifies each edge in the graph as slow, medium, or fast, depending on the other agents using that edge. Communication over a fixed social network occurs only when an agent changes the speed category it holds for a particular link, e.g., when it changes from believing a link is slow to believing it is medium. We find that the system dynamics are very sensitive to the ratio of the influence of direct observations on local beliefs to the influence of communicated beliefs. For some values of this ratio, convergence to good behavior can occur very quickly, but for others a brief period of good performance is followed by wild oscillations.
Acknowledgment
This research has been funded in part by the AFOSR MURI grant FA9550-08-1-0356.
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Scerri, P. (2013). Modulating Communication to Improve Multi-agent Learning Convergence. In: Sorokin, A., Pardalos, P. (eds) Dynamics of Information Systems: Algorithmic Approaches. Springer Proceedings in Mathematics & Statistics, vol 51. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7582-8_7
DOI: https://doi.org/10.1007/978-1-4614-7582-8_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7581-1
Online ISBN: 978-1-4614-7582-8
eBook Packages: Mathematics and Statistics (R0)