Abstract
This chapter introduces the Planning and Learning to Adapt Swiftly to Teammates to Improve Cooperation (PLASTIC) algorithms that enable an ad hoc team agent to cooperate with a variety of different teammates. One might think that the most appropriate thing for an ad hoc team agent to do is to “fit in” with its team by following the same behavior as its teammates. However, if the teammates’ behaviors are suboptimal, this approach will limit how much the ad hoc agent can help its team. Therefore, in this book, we adopt the approach of learning about different teammates and deciding how to act by leveraging this knowledge. This approach allows an ad hoc agent to reason about how well its knowledge of past teammates predicts its current teammates’ actions as well as to convert this knowledge into the actions it needs to take to accomplish its goals. If the knowledge of prior teammates accurately predicts the current teammates and the ad hoc agent is given enough time to plan, this approach will lead to optimal performance of the ad hoc agent, helping its team achieve the best possible outcome. Note that this may not be the optimal performance of any team, but it is optimal for the ad hoc agent given that the behaviors of its teammates are fixed.
This chapter contains material from four publications [1–4]. Note that some of Sect. 5.2 is joint work with Sarit Kraus and Avi Rosenfeld in addition to Peter Stone [3].
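The abstract's core idea, reasoning about how well knowledge of past teammates predicts the current teammates' actions, can be illustrated with a weighted-model update in the spirit of regret-minimization methods such as polynomial weights [5]. The sketch below is a hypothetical simplification, not the book's exact formulation: the model names, the loss function, and the learning rate `eta` are all illustrative assumptions.

```python
# Hypothetical sketch of PLASTIC-style teammate-model selection.
# Each candidate model maps a state to a distribution over the
# teammate's actions; models that predict poorly lose weight.

def update_beliefs(beliefs, models, observed_action, state, eta=0.1):
    """Polynomial-weights-style update over prior teammate models."""
    new_beliefs = {}
    for name, model in models.items():
        # Loss in [0, 1]: one minus the probability the model
        # assigned to the action the teammate actually took.
        loss = 1.0 - model(state).get(observed_action, 0.0)
        new_beliefs[name] = beliefs[name] * (1.0 - eta * loss)
    total = sum(new_beliefs.values())
    return {name: w / total for name, w in new_beliefs.items()}

# Two illustrative prior-teammate models (assumed, not from the book).
models = {
    "aggressive": lambda s: {"attack": 0.9, "defend": 0.1},
    "cautious":   lambda s: {"attack": 0.2, "defend": 0.8},
}
beliefs = {"aggressive": 0.5, "cautious": 0.5}

# Repeatedly observing the teammate defend shifts belief toward
# the "cautious" model, which the agent would then plan with.
for _ in range(10):
    beliefs = update_beliefs(beliefs, models, "defend", state=None)

best_model = max(beliefs, key=beliefs.get)
```

In the full algorithms, the selected (or belief-weighted) model feeds a planner, e.g. Monte Carlo tree search over the resulting MDP [6], so that the agent's own actions are chosen to best complement the predicted teammate behavior.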
References
Barrett, Samuel, and Peter Stone. 2014. Cooperating with unknown teammates in robot soccer. In AAAI workshop on multiagent interaction without prior coordination (MIPC 2014), July 2014.
Barrett, Samuel, Peter Stone, and Sarit Kraus. 2011. Empirical evaluation of ad hoc teamwork in the pursuit domain. In Proceedings of the tenth international conference on autonomous agents and multiagent systems (AAMAS), May 2011.
Barrett, Samuel, Peter Stone, Sarit Kraus, and Avi Rosenfeld. 2013. Teamwork with limited knowledge of teammates. In Proceedings of the twenty-seventh conference on artificial intelligence (AAAI), July 2013.
Barrett, Samuel, and Peter Stone. 2015. Cooperating with unknown teammates in complex domains: A robot soccer case study of ad hoc teamwork. In Proceedings of the twenty-ninth conference on artificial intelligence (AAAI), January 2015.
Blum, A., and Y. Mansour. 2007. Learning, regret minimization, and equilibria. In Algorithmic game theory. Cambridge University Press.
Silver, David, and Joel Veness. 2010. Monte-Carlo planning in large POMDPs. In Advances in neural information processing systems 23 (NIPS).
Hall, Mark, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. 2009. The WEKA data mining software: An update. SIGKDD Explorations 11: 10–18.
Pardoe, David, and Peter Stone. 2010. Boosting for regression transfer. In Proceedings of the twenty-seventh international conference on machine learning (ICML), June 2010.
Yao, Yi, and G. Doretto. 2010. Boosting for transfer learning with multiple sources. In Proceedings of the conference on computer vision and pattern recognition (CVPR), June 2010.
Huang, Pipei, Gang Wang, and Shiyin Qin. 2012. Boosting for transfer learning from multiple data sources. Pattern Recognition Letters 33(5): 568–579.
Zhuang, Fuzhen, Xiaohu Cheng, Sinno Jialin Pan, Wenchao Yu, Qing He, and Zhongzhi Shi. 2014. Transfer learning with multiple sources via consensus regularized autoencoders. In Machine learning and knowledge discovery in databases, vol. 8726, ed. Toon Calders, Floriana Esposito, Eyke Hüllermeier, and Rosa Meo, 417–431. Lecture notes in computer science. Berlin Heidelberg: Springer.
Fang, Min, Yong Guo, Xiaosong Zhang, and Xiao Li. 2015. Multi-source transfer learning based on label shared subspace. Pattern Recognition Letters 51: 101–106.
Ge, Liang, Jing Gao, and Aidong Zhang. 2013. OMS-TL: A framework of online multiple source transfer learning. In Proceedings of the 22nd ACM international conference on information & knowledge management (CIKM '13), 2423–2428. New York, NY, USA: ACM.
Ernst, Damien, Pierre Geurts, and Louis Wehenkel. 2005. Tree-based batch mode reinforcement learning. Journal of machine learning research (JMLR): 503–556.
Watkins, Christopher John Cornish Hellaby. 1989. Learning from delayed rewards. Ph.D. thesis, King's College, Cambridge, May 1989.
Deisenroth, Marc Peter, Gerhard Neumann, and Jan Peters. 2013. A survey on policy search for robotics. Foundations and Trends in Robotics 2(1–2): 1–142.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Barrett, S. (2015). The PLASTIC Algorithms. In: Making Friends on the Fly: Advances in Ad Hoc Teamwork. Studies in Computational Intelligence, vol 603. Springer, Cham. https://doi.org/10.1007/978-3-319-18069-4_5
DOI: https://doi.org/10.1007/978-3-319-18069-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18068-7
Online ISBN: 978-3-319-18069-4
eBook Packages: Engineering, Engineering (R0)