Imitative Reinforcement Learning for Soccer Playing Robots

  • Tobias Latzke
  • Sven Behnke
  • Maren Bennewitz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4434)


In this paper, we apply Reinforcement Learning (RL) to a real-world task. While complex problems have been solved by RL in simulated worlds, the cost of obtaining enough training examples often prohibits the use of plain RL in real-world scenarios. We propose three approaches to reduce training expenses for real-world RL. First, we replace the random exploration of the huge search space used by plain RL with guided exploration that imitates a teacher. Second, we do not use each experience only once; we store experiences and reuse them later, when their value is easier to assess. Finally, we use function approximators to represent the experience in a way that balances generalization and discrimination. We evaluate the performance of the combined extensions of plain RL using a humanoid robot in the RoboCup soccer domain. As we show in simulation and real-world experiments, our approach enables the robot to quickly learn fundamental soccer skills.
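The first two ideas in the abstract, teacher-guided exploration and reuse of stored experience, can be illustrated with a minimal sketch. This is not the authors' implementation: the corridor environment, the `teacher` function, the parameter names, and all numeric values are assumptions made for illustration, and a plain lookup table stands in for the function approximator the paper uses.

```python
import random

N_STATES = 8          # hypothetical corridor with cells 0..7
ACTIONS = (-1, +1)    # step left or step right
GOAL = N_STATES - 1   # reaching the rightmost cell ends the episode


def step(state, action):
    """Toy environment (assumed): move along the corridor, reward 1 at the goal."""
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def teacher(state):
    """Hypothetical teacher that always steps toward the goal."""
    return +1


def greedy(q, state):
    """Action with the highest stored value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(episodes=30, p_imitate=0.7, alpha=0.5, gamma=0.9,
          replay_passes=5, max_steps=100, seed=0):
    rng = random.Random(seed)
    # Lookup table used here in place of a function approximator.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    memory = []  # stored transitions (state, action, reward, next_state, done)
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < max_steps:
            # Guided exploration: imitate the teacher with probability
            # p_imitate, otherwise follow the learner's own greedy choice.
            if rng.random() < p_imitate:
                action = teacher(state)
            else:
                action = greedy(q, state)
            nxt, reward, done = step(state, action)
            memory.append((state, action, reward, nxt, done))
            state = nxt
            steps += 1
        # Experience reuse: replay all stored transitions, newest first,
        # so that reward information propagates backward along trajectories.
        for _ in range(replay_passes):
            for s, a, r, s2, d in reversed(memory):
                best_next = 0.0 if d else max(q[(s2, b)] for b in ACTIONS)
                q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q


if __name__ == "__main__":
    q = train()
    print([greedy(q, s) for s in range(GOAL)])
```

Replaying the stored transitions after each episode lets a single successful trajectory update the value estimates many times, which is the intuition behind reusing experience when real-world trials are expensive. In this sketch the imitation probability is fixed; lowering it over time would be one possible variation.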


Reinforcement Learning, Humanoid Robot, Neural Information Processing System, Real Robot, Learning Agent



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Tobias Latzke (1)
  • Sven Behnke (1)
  • Maren Bennewitz (1)
  1. University of Freiburg, Computer Science Institute, D-79110 Freiburg, Germany
