Abstract
Legged animals adopt the gait pattern appropriate to their speed and seamlessly transition between gaits when needed, striking a balance between achieving desired velocities and minimizing energy consumption. This ability makes them versatile and efficient when traversing natural terrain and well suited to long treks, and it is equally meaningful and important for quadruped robots to master. To this end, we propose an effective gait-heuristic reinforcement learning framework in which multi-gait locomotion and smooth gait transitions emerge automatically to reach target velocities while minimizing energy consumption. We incorporate a novel trajectory generator with explicit gait information as a memory mechanism into the deep reinforcement learning framework, allowing the quadruped robot to adopt reliable and distinct gait patterns while benefiting from the warm start the trajectory generator provides. Furthermore, we investigate the key factors contributing to the emergence of multi-gait locomotion. We test our framework on a closed-chain quadruped robot and demonstrate that it switches among gait patterns such as standing, walking, and trotting to adopt the most energy-efficient gait at a given speed. Finally, we deploy the learned controller on a physical quadruped robot and demonstrate the energy efficiency and robustness of our method.
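The core idea of modulating a gait-parameterized trajectory generator with a learned policy can be illustrated with a minimal sketch. This is not the authors' implementation; the gait phase offsets, the sinusoidal swing profile, and the function names (`trajectory_generator`, `apply_policy`) are illustrative assumptions in the spirit of policies-modulating-trajectory-generators approaches.

```python
import numpy as np

# Hypothetical per-leg phase offsets (fractions of one gait cycle), ordered
# front-left, front-right, rear-left, rear-right. Illustrative values only.
GAIT_OFFSETS = {
    "walk": np.array([0.0, 0.5, 0.25, 0.75]),  # four-beat sequence
    "trot": np.array([0.0, 0.5, 0.5, 0.0]),    # diagonal legs in phase
}


def trajectory_generator(phase, gait, amplitude=0.05):
    """Periodic vertical foot-clearance targets for each leg at a global
    phase in [0, 1). Swing height follows a half-sine, clipped to zero
    during stance."""
    leg_phase = (phase + GAIT_OFFSETS[gait]) % 1.0
    return amplitude * np.maximum(0.0, np.sin(2 * np.pi * leg_phase))


def apply_policy(phase, gait, residual):
    """A learned policy would modulate the trajectory generator's output
    with small residual corrections; `residual` here stands in for the
    network's action."""
    return trajectory_generator(phase, gait) + residual
```

Under this sketch, the trajectory generator encodes the gait explicitly (the phase offsets), giving the policy a reliable periodic prior to correct, rather than requiring it to discover cyclic leg coordination from scratch.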
Data Availability Statement
The data that support the findings of this study are available from the corresponding author, Wei Wang, upon reasonable request.
Notes
[Online] Available: https://youtu.be/rf9imDqWTB4.
Ethics declarations
Conflicts of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (MP4, 222,274 KB)
About this article
Cite this article
Wei, L., Zou, J., Yu, X. et al. Economical Quadrupedal Multi-Gait Locomotion via Gait-Heuristic Reinforcement Learning. J Bionic Eng (2024). https://doi.org/10.1007/s42235-024-00517-3