Autonomous Robots, Volume 43, Issue 1, pp 79–95

Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid

  • Petar Kormushev
  • Barkan Ugurlu
  • Darwin G. Caldwell
  • Nikos G. Tsagarakis

Abstract

Modern humanoid robots include not only active compliance but also passive compliance. Apart from improved safety and dependability, the availability of passive elements, such as springs, opens up new possibilities for improving energy efficiency. With this in mind, this paper addresses the challenging open problem of exploiting passive compliance for energy-efficient humanoid walking. To this end, we develop a method comprising two parts: an optimization part that finds an optimal vertical center-of-mass trajectory, and a walking pattern generator part that uses this trajectory to produce a dynamically balanced gait. For the optimization part, we propose a reinforcement learning approach that dynamically evolves the policy parametrization during the learning process. By gradually increasing the representational power of the policy parametrization, it finds better policies faster and at lower computational cost. For the walking generator part, we develop a variable-center-of-mass-height ZMP-based bipedal walking pattern generator. The method is tested in real-world experiments with the bipedal robot COMAN and achieves an 18% reduction in electric energy consumption by learning to use the passive compliance of the robot efficiently.
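
To illustrate what gradually increasing the representational power of a policy parametrization can look like, the following is a minimal sketch and not the method used in the paper. It assumes, purely for illustration, that the vertical center-of-mass trajectory is stored as a piecewise-linear curve over a normalized gait cycle, and that the policy is refined by inserting a knot at the midpoint of every segment. The function names `evaluate` and `refine` and all numeric values are hypothetical.

```python
import numpy as np

def evaluate(knot_t, knot_z, t):
    """Height at times t for a piecewise-linear trajectory through the knots."""
    return np.interp(t, knot_t, knot_z)

def refine(knot_t, knot_z):
    """Insert a knot at the midpoint of every segment.

    The new knot heights are sampled from the current trajectory, so the
    refined policy has more parameters but encodes the same curve.
    """
    mid_t = 0.5 * (knot_t[:-1] + knot_t[1:])
    new_t = np.sort(np.concatenate([knot_t, mid_t]))
    new_z = evaluate(knot_t, knot_z, new_t)
    return new_t, new_z

# Hypothetical coarse policy: 3 knots over one normalized gait cycle.
knot_t = np.array([0.0, 0.5, 1.0])
knot_z = np.array([0.45, 0.47, 0.45])  # illustrative heights in metres

# Refined policy: 5 knots, same trajectory.
fine_t, fine_z = refine(knot_t, knot_z)

# Compare both parametrizations on a dense time grid.
t = np.linspace(0.0, 1.0, 201)
before = evaluate(knot_t, knot_z, t)
after = evaluate(fine_t, fine_z, t)
```

Under these assumptions, a learner can search the small parameter space first and then continue from the same trajectory in the larger one, so nothing learned at the coarse level is lost when the parametrization grows.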

Keywords

Bipedal walking · Energy efficiency · Reinforcement learning · Passive compliance

Notes

Acknowledgements

This work was partially supported by the EU project AMARSi, under the contract FP7-ICT-248311.

Supplementary material

Supplementary material 1 (mp4 28136 KB)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Dyson School of Design Engineering, Imperial College London, London, UK
  2. Department of Mechanical Engineering, Ozyegin University, Istanbul, Turkey
  3. Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy