Allocating training instances to learning agents for team formation

Autonomous Agents and Multi-Agent Systems

Abstract

Agents can learn to improve their coordination with their teammates and increase team performance. The number of training instances is finite, and each training instance is an opportunity for the learning agents to improve their coordination. In this article, we focus on allocating training instances to learning agent pairs, i.e., pairs of agents that improve their coordination with each other, with the goal of team formation. Agents learn at different rates, so the allocation of training instances affects the performance of the team that is formed. We build upon previous work on the Synergy Graph model, which is learned completely from data and represents agents' capabilities and compatibility in a multi-agent team. We formally define the learning agents team formation problem and compare it with the multi-armed bandit problem. We consider learning agent pairs that improve linearly and geometrically, i.e., where the marginal improvement decreases by a constant factor. We contribute algorithms that allocate the training instances and compare them against algorithms from the multi-armed bandit problem. In our simulations, our algorithms perform similarly to the bandit algorithms in the linear case and outperform them in the geometric case. Further, we apply our model and algorithms to a multi-agent foraging problem, demonstrating the efficacy of our algorithms in general multi-agent problems.
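
To make the two improvement models concrete, the sketch below (our own illustration, not the algorithms contributed in the article) greedily allocates a finite budget of training instances to agent pairs whose pairwise synergy improves either linearly or geometrically. The pair parameters (`rate`, `gain`, `factor`) and the myopic allocation rule are hypothetical, and the improvement rates are assumed known here, whereas the bandit comparison in the article concerns settings where they must be learned.

```python
# Illustrative sketch only: greedy allocation of a finite training budget to
# learning agent pairs under linear and geometric improvement models.
# Parameter names and values are assumptions, not taken from the article.

def marginal_gain(pair, model):
    """Synergy improvement the pair would gain from its next training instance."""
    if model == "linear":
        return pair["rate"]  # constant improvement per instance
    # Geometric: marginal improvement shrinks by a constant factor with
    # each training instance the pair has already received.
    return pair["gain"] * pair["factor"] ** pair["trained"]

def allocate(pairs, budget, model):
    """Myopically give each instance to the pair with the largest next gain."""
    for _ in range(budget):
        best = max(pairs, key=lambda p: marginal_gain(p, model))
        best["synergy"] += marginal_gain(best, model)
        best["trained"] += 1
    return {p["name"]: round(p["synergy"], 3) for p in pairs}

pairs = [
    {"name": "(a1, a2)", "synergy": 1.0, "rate": 0.10, "gain": 0.5, "factor": 0.8, "trained": 0},
    {"name": "(a1, a3)", "synergy": 0.8, "rate": 0.15, "gain": 0.3, "factor": 0.9, "trained": 0},
]
print(allocate([dict(p) for p in pairs], budget=20, model="linear"))
print(allocate([dict(p) for p in pairs], budget=20, model="geometric"))
```

Under the linear model this greedy rule trains the fastest-improving pair every time, while under the geometric model it naturally spreads instances across pairs as the marginal gains decay.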

Notes

  1. More generally, we use our previous Synergy Graph algorithms to learn the capabilities and coordination of the agents completely from observations.
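
As a loose illustration of learning coordination from observations (a minimal hypothetical estimator of our own, not the Synergy Graph learning algorithms the note refers to), one could score every agent pair by the mean performance of the observed teams containing both agents:

```python
# Hypothetical estimator, not the Synergy Graph algorithm: score each agent
# pair by the mean performance of observed teams containing both agents.
from collections import defaultdict
from itertools import combinations

def estimate_pair_scores(observations):
    """observations: iterable of (team, performance), team a tuple of agent ids."""
    totals, counts = defaultdict(float), defaultdict(int)
    for team, perf in observations:
        for pair in combinations(sorted(team), 2):
            totals[pair] += perf
            counts[pair] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}

obs = [(("a1", "a2", "a3"), 7.0), (("a1", "a2"), 9.0), (("a2", "a3"), 4.0)]
# -> ('a1','a2'): 8.0, ('a1','a3'): 7.0, ('a2','a3'): 5.5
print(estimate_pair_scores(obs))
```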

Acknowledgements

The authors thank Junyun Tay for her feedback and comments, and Yucheng Low for his help with the statistical significance tests. This work was partially supported by the Air Force Research Laboratory under grant number FA87501020165, by the Office of Naval Research under grant number N00014-09-1-1031, and by the Agency for Science, Technology and Research (A*STAR), Singapore. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government, or any other entity. This work was also supported by the A*STAR Computational Resource Centre through the use of its high-performance computing facilities.

Author information

Correspondence to Somchaya Liemhetcharat.

Cite this article

Liemhetcharat, S., Veloso, M. Allocating training instances to learning agents for team formation. Auton Agent Multi-Agent Syst 31, 905–940 (2017). https://doi.org/10.1007/s10458-016-9355-3
