Strategy-Based Learning through Communication with Humans

  • Nguyen-Thinh Le
  • Niels Pinkwart
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7327)


Complex application systems typically contain not only autonomous components that can be represented by agents, but also humans who play a role in the system. Agents can learn from their interactions with humans, and this learning can make the system more stable. How can agents adopt human strategies for resolving conflict situations? In this paper, we present a learning algorithm for agents that is based on their interactions with humans in conflict situations. The learning algorithm consists of four phases: 1) agents detect a conflict situation, 2) a conversation takes place between a human and the agents, 3) the agents involved in the conflict situation evaluate the strategy applied by the human, and 4) agents that have interacted with humans apply the best-rated strategy in a similar conflict situation. We evaluated this learning algorithm using a Jade/Repast simulation framework. The evaluation study shows two benefits of the learning algorithm. First, through interaction with humans, agents can handle conflict situations, and thus the system becomes more stable. Second, agents adopt the problem-solving strategy that humans have applied most frequently.
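The four phases can be illustrated with a minimal Python sketch. It is not the authors' Jade/Repast implementation. The class `StrategyLearningAgent`, its method names, the additive rating scheme, and the use of an exact situation label as the test for a "similar" conflict are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional


class StrategyLearningAgent:
    """Toy agent that learns conflict-resolution strategies from a human.

    Hypothetical sketch: the rating scheme and situation matching are
    assumptions, not details taken from the paper.
    """

    def __init__(self) -> None:
        # ratings[situation][strategy] -> accumulated rating
        self.ratings: Dict[str, Dict[str, float]] = defaultdict(
            lambda: defaultdict(float)
        )

    def detect_conflict(self, observation: dict) -> Optional[str]:
        # Phase 1: return a conflict label, or None if there is no conflict.
        return observation.get("conflict")

    def ask_human(self, situation: str, human: Callable[[str], str]) -> str:
        # Phase 2: the conversation is reduced to a single callback that
        # returns the strategy the human applies.
        return human(situation)

    def evaluate(self, situation: str, strategy: str, reward: float = 1.0) -> None:
        # Phase 3: rate the strategy the human applied (assumed: add a reward).
        self.ratings[situation][strategy] += reward

    def best_strategy(self, situation: str) -> Optional[str]:
        # Phase 4: reuse the best-rated strategy for a situation with the
        # same label; return None if nothing has been learned for it yet.
        known = self.ratings.get(situation)
        if not known:
            return None
        return max(known, key=known.get)


# Usage: a simulated human resolves the same conflict three times.
agent = StrategyLearningAgent()
choices = iter(["give_way", "negotiate", "give_way"])
for _ in range(3):
    situation = agent.detect_conflict({"conflict": "blocked_crossing"})
    if situation is not None:
        strategy = agent.ask_human(situation, lambda s: next(choices))
        agent.evaluate(situation, strategy)

print(agent.best_strategy("blocked_crossing"))  # prints "give_way"
```

With a constant reward of 1.0, the best-rated strategy is simply the one the human applied most often, which mirrors the second finding reported above. A different rating function could be substituted in `evaluate` without changing the other phases.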


Keywords: Agent-Human learning, multi-agent systems, machine learning, evaluation





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Nguyen-Thinh Le (1)
  • Niels Pinkwart (1)
  1. Clausthal University of Technology, Germany
