Blind Hexapod Locomotion in Complex Terrain with Gait Adaptation Using Deep Reinforcement Learning and Classification

  • Published in: Journal of Intelligent & Robotic Systems

Abstract

We present a scalable two-level architecture for Hexapod locomotion through complex terrain without the use of exteroceptive sensors. Our approach assumes that the target complex terrain can be modeled by N discrete terrain distributions, each capturing an individual difficulty of the target terrain. Expert policies (physical locomotion controllers), modeled by Artificial Neural Networks, are trained independently in these individual scenarios using Deep Reinforcement Learning. These policies are then autonomously multiplexed during inference by a Recurrent Neural Network terrain classifier conditioned on the state history, yielding an adaptive gait appropriate for the current terrain. We assess policy robustness through several tests in which we vary contact, friction, and actuator properties. We also demonstrate goal-based positional control of such a system and a way of selecting among several gait criteria during deployment, giving a complete solution for blind Hexapod locomotion in a practical setting. The Hexapod platform and all our experiments are modeled in the MuJoCo [25] physics simulator. Demonstrations are available in the supplementary video.
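The two-level control loop described in the abstract can be sketched as follows. Note that the expert policies, the terrain labels, and the classifier here are hypothetical placeholders (a pooled linear softmax standing in for the paper's trained RNN classifier and DRL-trained neural network policies), not the authors' models:

```python
import numpy as np

# Hypothetical expert policies: each maps a proprioceptive state to joint
# targets. In the paper these are neural networks trained with DRL, one
# per terrain distribution; here they are trivial placeholder functions.
def expert_flat(state):
    return 0.1 * state

def expert_stairs(state):
    return -0.1 * state

def expert_rubble(state):
    return 0.5 * np.tanh(state)

EXPERTS = [expert_flat, expert_stairs, expert_rubble]  # N = 3 terrains

def classify_terrain(state_history, W):
    """Toy stand-in for the RNN terrain classifier: pools the state
    history over time and applies a linear softmax over N terrain classes."""
    features = np.mean(state_history, axis=0)   # (state_dim,)
    logits = W @ features                       # (n_terrains,)
    exp = np.exp(logits - logits.max())
    return int(np.argmax(exp / exp.sum()))

def control_step(state_history, W):
    """Multiplexing step: classify the terrain from the state history,
    then run the matching expert policy on the latest state."""
    terrain = classify_terrain(state_history, W)
    return EXPERTS[terrain](state_history[-1]), terrain

# Minimal usage with a random state history and fixed classifier weights.
rng = np.random.default_rng(0)
history = rng.standard_normal((20, 8))          # 20 timesteps, 8-dim state
W = rng.standard_normal((3, 8))                 # placeholder classifier weights
action, terrain = control_step(history, W)
```

The key design point this illustrates is the decoupling: experts are trained independently per terrain, and only the classifier decides, at inference time, which expert is active.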


References

  1. Vespignani, M., Friesen, J.M., SunSpiral, V., Bruce, J.: Design of superball v2, a compliant tensegrity robot for absorbing large impacts. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2865–2871 (2018). https://doi.org/10.1109/IROS.2018.8594374

  2. Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. CoRR abs/1609.05521 (2016). URL http://arxiv.org/abs/1609.05521

  3. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017). URL http://arxiv.org/abs/1707.06347

  4. Wang, T., Liao, R., Ba, J., Fidler, S.: Nervenet: Learning structured policy with graph neural networks. In: International Conference on Learning Representations (2018). URL https://openreview.net/forum?id=S1sqHMZCb

  5. Peng, X.B., Berseth, G., van de Panne, M.: Terrain-adaptive locomotion skills using deep reinforcement learning. ACM Transactions on Graphics (Proc. SIGGRAPH 2016) 35(4) (2016)

  6. Bjelonic, M., Kottege, N., Homberger, T., Borges, P., Beckerle, P., Chli, M.: Weaver: Hexapod robot for autonomous navigation on unstructured terrain. Journal of Field Robotics. 35, 1063–1079 (2018). https://doi.org/10.1002/rob.21795

  7. Yu, W., Turk, G., Liu, C.K.: Learning symmetry and low-energy locomotion. CoRR abs/1801.08093 (2018). URL http://arxiv.org/abs/1801.08093

  8. Boston Dynamics, Spot. https://www.bostondynamics.com/spot. Accessed: 16-10-2019

  9. Ijspeert, A.J.: Central pattern generators for locomotion control in animals and robots: A review. Neural networks: the official journal of the International Neural Network Society. 21(4), 642–653 (2008)

  10. Trossen Robotics. https://www.trossenrobotics.com/. Accessed: 22-05-2010

  11. Isvara, Y., Rachmatullah, S., Mutijarsa, K., Prabakti, D.E., Pragitatama, W.: Terrain adaptation gait algorithm in a hexapod walking robot. In: 2014 13th International Conference on Control Automation Robotics Vision (ICARCV), pp. 1735–1739 (2014). https://doi.org/10.1109/ICARCV.2014.7064578

  12. OpenAI, Andrychowicz, M., Baker, B., Chociej, M., Józefowicz, R., McGrew, B., Pachocki, J., Petron, A., Plappert, M., Powell, G., Ray, A., Schneider, J., Sidor, S., Tobin, J., Welinder, P., Weng, L., Zaremba, W.: Learning dexterous in-hand manipulation. CoRR abs/1808.00177 (2018). URL http://arxiv.org/abs/1808.00177

  13. Graves, A., Liwicki, M., Fernández, S., Bertolami, R., Bunke, H., Schmidhuber, J.: A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 31(5), 855–868 (2009). https://doi.org/10.1109/TPAMI.2008.137

  14. Kruijff, G., Kruijff-Korbayová, I., Keshavdas, S., Larochelle, B., Janícek, M., Colas, F., Liu, M., Pomerleau, F., Siegwart, R., Neerincx, M., Looije, R., Smets, N., Mioch, T., van Diggelen, J., Pirri, F., Gianni, M., Ferri, F., Menna, M., Worst, R., Linder, T., Tretyakov, V., Surmann, H., Svoboda, T., Reinštein, M., Zimmermann, K., Petříćek, T., Hlaváč, V.: Designing, developing, and deploying systems to support human-robot teams in disaster response. Advanced Robotics. 28(23), 1547–1570 (2014). https://doi.org/10.1080/01691864.2014.985335

  15. Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: S.A. Solla, T.K. Leen, K. Muller (eds.) Advances in Neural Information Processing Systems 12, pp. 1057-1063. MIT Press (2000). URL http://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf

  16. Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. CoRR abs/1703.06907 (2017). URL http://arxiv.org/abs/1703.06907

  17. Perlin, K.: Improving noise. ACM Trans. Graph. 21(3), 681–682 (2002). https://doi.org/10.1145/566654.566636

  18. Wierstra, D., Schaul, T., Glasmachers, T., Sun, Y., Peters, J., Schmidhuber, J.: Natural evolution strategies. Journal of Machine Learning Research 15, 949-980 (2014). URL http://jmlr.org/papers/v15/wierstra14a.html

  19. Hutter, M., Gehring, C., Lauber, A., Gunther, F., Bellicoso, C.D., Tsounis, V., Fankhauser, P., Diethelm, R., Bachmann, S., Bloesch, M., Kolvenbach, H., Bjelonic, M., Isler, L., Meyer, K.: ANYmal - toward legged robots for harsh environments. Advanced Robotics. 31(17), 918–931 (2017). https://doi.org/10.1080/01691864.2017.1378591

  20. Graves, A., Mohamed, A., Hinton, G.E.: Speech recognition with deep recurrent neural networks. CoRR abs/1303.5778 (2013). URL http://arxiv.org/abs/1303.5778

  21. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735

  22. Manoonpong, P., Parlitz, U., Wörgötter, F.: Neural control and adaptive neural forward models for insect-like, energy-efficient, and adaptable locomotion of walking machines. Frontiers in neural circuits. 7, 12 (2013). https://doi.org/10.3389/fncir.2013.00012

  23. Ross, S., Gordon, G.J., Bagnell, J.A.: A reduction of imitation learning and structured prediction to no-regret online learning. CoRR abs/1011.0686 (2010). URL http://arxiv.org/abs/1011.0686

  24. Pecka, M., Zimmermann, K., Reinstein, M., Svoboda, T.: Controlling robot morphology from incomplete measurements. CoRR abs/1612.02739 (2016). URL http://arxiv.org/abs/1612.02739

  25. Todorov, E., Erez, T., Tassa, Y.: MuJoCo: A physics engine for model-based control. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026–5033 (2012)

  26. Čížek, P., Faigl, J.: On locomotion control using position feedback only in traversing rough terrains with hexapod crawling robot. IOP Conference Series: Materials Science and Engineering. 428, 012065 (2018). https://doi.org/10.1088/1757-899X/428/1/012065

  27. Sanchez-Gonzalez, A., Heess, N., Springenberg, J.T., Merel, J., Riedmiller, M.A., Hadsell, R., Battaglia, P.: Graph networks as learnable physics engines for inference and control. CoRR abs/1806.01242 (2018). URL http://arxiv.org/abs/1806.01242

  28. Saranli, U., Buehler, M., Koditschek, D.E.: RHex: A simple and highly mobile hexapod robot. The International Journal of Robotics Research. 20, 616–631 (2001). https://doi.org/10.1177/02783640122067570

  29. Xie, Z., Berseth, G., Clary, P., Hurst, J.W., van de Panne, M.: Feedback control for cassie with deep reinforcement learning. CoRR abs/1803.05580 (2018). URL http://arxiv.org/abs/1803.05580

  30. Sutton, R.: The Bitter Lesson. http://www.incompleteideas.net/IncIdeas/BitterLesson.html. Accessed: 2019-04-18


Acknowledgements

The research leading to these results has received funding from the Czech Science Foundation under Project 17-08842S.

Author information


Correspondence to Teymur Azayev.

Electronic supplementary material

Supplementary video (MP4 22,534 KB)


Cite this article

Azayev, T., Zimmermann, K.: Blind Hexapod Locomotion in Complex Terrain with Gait Adaptation Using Deep Reinforcement Learning and Classification. J Intell Robot Syst 99, 659–671 (2020). https://doi.org/10.1007/s10846-020-01162-8
