Artificial Life and Robotics, Volume 21, Issue 1, pp 11–17

Reinforcement learning in dynamic environment: abstraction of state-action space utilizing properties of the robot body and environment

Original Article


In this paper, we address the autonomous control of a 3D snake-like robot using reinforcement learning and apply it in a dynamic environment. In general, snake-like robots achieve high mobility through their many degrees of freedom, and they can move over dynamically shifting environments such as rubble. However, this freedom and flexibility lead to a state explosion problem, and the complexity of the dynamic environment leads to incomplete learning by the robot. To solve these problems, we focus on the properties of the actual operating environment and the dynamics of the mechanical body. We design the body of the robot so that, by utilizing these properties, it abstracts a small but necessary state-action space, which makes it possible to apply reinforcement learning. To demonstrate the effectiveness of the proposed snake-like robot, we conduct experiments. From the experimental results, we conclude that learning is completed within a reasonable time and that the robot acquires effective behaviors for adapting to an unknown 3D dynamic environment.
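As a rough illustration of the state explosion problem mentioned above, the following sketch compares the size of a naively discretized joint-angle state space with a small abstracted one, and shows a standard tabular Q-learning update that such a small table makes tractable. The joint count, the number of discretization levels, the abstracted state definition, and the function names are all assumptions made for illustration; none of them are taken from the paper, and the paper's actual learning rule may differ.

```python
from collections import defaultdict


def joint_state_count(num_joints: int, levels_per_joint: int) -> int:
    """Number of states if every joint angle is discretized independently."""
    return levels_per_joint ** num_joints


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One standard tabular Q-learning step.

    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical robot: 12 joints, 5 discretization levels per joint.
naive_states = joint_state_count(12, 5)

# Hypothetical abstraction: the body passively conforms to the terrain,
# so the learner only observes a coarse 3 x 3 grid of head directions.
abstract_states = 3 * 3

print(f"naive states:    {naive_states}")
print(f"abstract states: {abstract_states}")

# A Q-table over the abstracted space stays tiny; unseen entries default to 0.
actions = ["left", "straight", "right"]  # assumed action set
q_table = defaultdict(float)
q_update(q_table, state=(1, 1), action="straight", reward=1.0,
         next_state=(1, 2), actions=actions)
```

The point of the comparison is only the difference in scale: the naive table grows exponentially with the number of joints, while the abstracted table does not depend on it.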


Keywords: Dynamic environment, Reinforcement learning, Crawler robot, Snake-like robot



This study was partially supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) (Grant-in-Aid for Young Scientists (B), 22700156, 2011).



Copyright information

© ISAROB 2016

Authors and Affiliations

  1. Hosei University, Tokyo, Japan
  2. Murata Manufacturing Co., Ltd., Nagaokakyo, Japan
