Bridging the Reality Gap via Progressive Bayesian Optimisation

Yu, Chen; Rosendo, Andre

doi:10.1007/978-3-031-15226-9_17


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 530)

Included in the following conference series:

Climbing and Walking Robots Conference


Abstract

The recent rapid development of deep-learning-based control strategies has made the reality gap a critical issue in robotics, especially for legged robots. We propose a novel system identification framework, Progressive Bayesian Optimisation (ProBO), which bridges the reality gap by tuning simulation parameters. Because the reality gap is usually harder to narrow for dynamic locomotion trajectories than for their static counterparts, we first train a Gaussian process model on the data set of an easier trajectory and then use it as the prior that starts the learning process for a harder one. We implement ProBO on a quadruped robot to narrow the reality gaps of a set of bounding gaits at different speeds. Results show that, after training on the initial gait, our method can outperform all other alternatives.
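The progressive idea in the abstract (fit a Gaussian process on an easier trajectory, then reuse it as the prior for a harder one) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the chapter: the single simulation parameter in [0, 1], the two toy gap functions `easy_gap` and `hard_gap`, the squared-exponential kernel, and the lower-confidence-bound acquisition rule. The prior is carried over here as the prior mean of the second Gaussian process, which is only one possible reading of "make it a prior".

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

class GP:
    """Tiny GP regressor with an optional prior mean function (illustrative)."""

    def __init__(self, prior_mean=None, noise=1e-4):
        self.prior_mean = prior_mean or (lambda x: np.zeros(len(x)))
        self.noise = noise
        self.X = None

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        # Model only the residual that the prior mean does not explain.
        resid = np.asarray(y, float) - self.prior_mean(self.X)
        K = rbf(self.X, self.X) + self.noise * np.eye(len(self.X))
        self.K_inv = np.linalg.inv(K)
        self.alpha = self.K_inv @ resid
        return self

    def predict(self, Xs):
        Xs = np.asarray(Xs, float)
        mean = self.prior_mean(Xs)
        if self.X is None:  # no data yet: return the prior itself
            return mean, np.ones(len(Xs))
        Ks = rbf(Xs, self.X)
        mean = mean + Ks @ self.alpha
        var = 1.0 - np.einsum("ij,jk,ik->i", Ks, self.K_inv, Ks)
        return mean, np.sqrt(np.clip(var, 1e-12, None))

def bayes_opt(gap, candidates, n_iter, prior_mean=None, beta=2.0):
    """Minimise `gap` over `candidates` with a lower-confidence-bound rule."""
    gp, X, y = GP(prior_mean), [], []
    for _ in range(n_iter):
        mean, std = gp.predict(candidates)
        x = candidates[np.argmin(mean - beta * std)]
        X.append(x)
        y.append(gap(x))
        gp.fit(np.array(X), np.array(y))
    return gp, np.array(X), np.array(y)

# Hypothetical reality-gap measures for one simulation parameter in [0, 1].
def easy_gap(x):
    return float((x[0] - 0.60) ** 2)

def hard_gap(x):
    return float((x[0] - 0.65) ** 2 + 0.02 * np.sin(25 * x[0]) ** 2)

candidates = np.linspace(0.0, 1.0, 101)[:, None]

# Stage 1: learn the easier trajectory from a zero-mean prior.
easy_gp, X_easy, y_easy = bayes_opt(easy_gap, candidates, n_iter=10)

# Stage 2: the easy GP's posterior mean becomes the prior mean for the harder one.
def easy_prior(x):
    return easy_gp.predict(x)[0]

hard_gp, X_hard, y_hard = bayes_opt(hard_gap, candidates, n_iter=8,
                                    prior_mean=easy_prior)

best_x = X_hard[np.argmin(y_hard)]
print("best parameter found for the harder trajectory:", best_x[0])
```

In this sketch the second stage starts its search where the first stage believes the gap is small instead of at an arbitrary point, which is the intuition behind reusing the easier model. On a real robot, each call to the gap function would involve comparing a simulated rollout with a recorded hardware trajectory.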



Author information

Authors and Affiliations

ShanghaiTech University, Shanghai, China
Chen Yu & Andre Rosendo

Authors

Chen Yu
Andre Rosendo

Corresponding authors

Correspondence to Chen Yu or Andre Rosendo.

Editor information

Editors and Affiliations

Faculty of Sciences and Technology, University of the Azores, Ponta Delgada, Portugal
José M. Cascalho
School of Engineering, London South Bank University, London, UK
Mohammad Osman Tokhi
School of Engineering, Polytechnic Institute of Porto, Porto, Portugal
Manuel F. Silva
Faculty of Sciences and Technology, University of the Azores, Ponta Delgada, Portugal
Armando Mendes
Mechanical, Materials and Manufacturing Engineering (M3), Faculty of Engineering, University of Nottingham, Nottingham, UK
Khaled Goher
Faculty of Sciences and Technology, University of the Azores, Ponta Delgada, Portugal
Matthias Funk


About this paper

Cite this paper

Yu, C., Rosendo, A. (2023). Bridging the Reality Gap via Progressive Bayesian Optimisation. In: Cascalho, J.M., Tokhi, M.O., Silva, M.F., Mendes, A., Goher, K., Funk, M. (eds) Robotics in Natural Settings. CLAWAR 2022. Lecture Notes in Networks and Systems, vol 530. Springer, Cham. https://doi.org/10.1007/978-3-031-15226-9_17


DOI: https://doi.org/10.1007/978-3-031-15226-9_17
Published: 25 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15225-2
Online ISBN: 978-3-031-15226-9
eBook Packages: Intelligent Technologies and Robotics (R0)
