Adaptivity on the Robot Brain Architecture Level Using Reinforcement Learning

  • Tijn van der Zant
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7416)

Abstract

The design and implementation of a robot brain often requires choosing between different modules with similar functionality. Many implementations and components are easy to create or can be downloaded, but it is difficult to assess which combinations of modules work well and which do not. This paper discusses a reinforcement learning mechanism in which the robot chooses between the different components using empirical feedback and optimization criteria. Using the interval estimation algorithm, the robot deselects poorly functioning modules and retains only the best ones. A discount factor ensures that the robot keeps adapting to new circumstances in the real world. This allows the robot to adapt itself continuously at the architecture level, and it also allows large development teams to create several implementations with similar functionality, giving the robot the biggest chance of solving a task. The architecture is tested in the RoboCup@Home setting and can handle failure situations.
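The selection mechanism the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the class name, the confidence-bound formula, and the decayed trial counts are assumptions, chosen to show the general shape of interval estimation with a discount factor. Each candidate module keeps a running success estimate; the robot picks the module with the highest upper confidence bound, so consistently poor modules are deselected, while the discount decays old evidence so the selector keeps adapting.

```python
import math

class IntervalEstimationSelector:
    """Hypothetical sketch of module selection via interval estimation."""

    def __init__(self, modules, z=1.96, discount=0.99):
        self.modules = list(modules)           # candidate modules with similar functionality
        self.z = z                             # controls the width of the confidence interval
        self.discount = discount               # decay applied to old evidence
        self.n = {m: 0.0 for m in modules}     # discounted trial counts
        self.mean = {m: 0.0 for m in modules}  # running mean reward per module

    def upper_bound(self, m):
        """Upper end of the confidence interval on the module's mean reward."""
        if self.n[m] == 0:
            return float("inf")                # untried modules are tried first
        return self.mean[m] + self.z / math.sqrt(self.n[m])

    def select(self):
        """Choose the module with the highest upper confidence bound."""
        return max(self.modules, key=self.upper_bound)

    def update(self, module, reward):
        """Record empirical feedback (e.g. task success = 1.0, failure = 0.0)."""
        for m in self.modules:                 # discount everyone's old evidence
            self.n[m] *= self.discount
        self.n[module] += 1.0
        self.mean[module] += (reward - self.mean[module]) / self.n[module]
```

With this shape, a module whose upper bound falls below a competitor's stops being selected, which matches the deselection behavior described above; setting the discount below 1 keeps the intervals from collapsing, so the robot re-explores when circumstances change.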

Keywords

adaptivity · behavior selection · RoboCup@Home · robot brain development · interval estimation algorithm · reinforcement learning

References

  1. Asada, M., Hosoda, K., Kuniyoshi, Y., Ishiguro, H., Inui, T., Yoshikawa, Y., Ogino, M., Yoshida, C.: Cognitive developmental robotics: A survey. IEEE Transactions on Autonomous Mental Development 1(1), 12–34 (2009)
  2. Bellas, F., Duro, R., Faina, A., Souto, D.: Multilevel Darwinist Brain (MDB): Artificial evolution in a cognitive architecture for real robots. IEEE Transactions on Autonomous Mental Development 2(4), 340–354 (2010)
  3. van Dijk, S.G., Polani, D., Nehaniv, C.L.: Hierarchical Behaviours: Getting the Most Bang for Your Bit. In: Kampis, G., Karsai, I., Szathmáry, E. (eds.) ECAL 2009, Part II. LNCS, vol. 5778, pp. 342–349. Springer, Heidelberg (2011)
  4. Gerkey, B.P., Vaughan, R.T., Howard, A.: The Player/Stage project: Tools for multi-robot and distributed sensor systems. In: Proceedings of the 11th International Conference on Advanced Robotics, pp. 317–323 (2003)
  5. Kaelbling, L.P.: Learning in Embedded Systems. MIT Press (1993)
  6. Kitano, H., Asada, M., Kuniyoshi, Y., Noda, I., Osawa, E., Matsubara, H.: RoboCup: A Challenge Problem for AI. AI Magazine 18(1), 73–85 (1997)
  7. Montemerlo, M., Roy, N., Thrun, S.: Perspectives on standardization in mobile robot programming: The Carnegie Mellon Navigation (CARMEN) toolkit. In: Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pp. 2436–2441 (2003)
  8. Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. MIT Press (1998)
  9. Vigorito, C., Barto, A.: Intrinsically motivated hierarchical skill learning in structured environments. IEEE Transactions on Autonomous Mental Development 2(2), 132–143 (2010)
  10. Wiering, M., Schmidhuber, J.: Efficient model-based exploration. In: Proceedings of the Sixth International Conference on Simulation of Adaptive Behavior: From Animals to Animats 6, pp. 223–228. MIT Press/Bradford Books (1998)
  11. Wisspeintner, T., van der Zant, T., Iocchi, L., Schiffer, S.: RoboCup@Home: Scientific Competition and Benchmarking for Domestic Service Robots. Interaction Studies 10(3), 392–426 (2009), http://dx.doi.org/10.1075/is.10.3.06wis
  12. van der Zant, T., Wiering, M., van Eijck, J.: On-line robot learning using the interval estimation algorithm. In: Proceedings of the 7th European Workshop on Reinforcement Learning, pp. 11–12 (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tijn van der Zant
  1. Artificial Intelligence Dept., University of Groningen, The Netherlands
