Abstract
The recent rapid development of deep-learning-based control strategies has made the reality gap a critical issue at the forefront of robotics, especially for legged robots. We propose a novel system identification framework, Progressive Bayesian Optimisation (ProBO), which bridges the reality gap by tuning simulation parameters. Because the reality gap is usually harder to narrow for dynamic locomotion trajectories than for their static counterparts, we train a Gaussian process model on the data set of an easier trajectory and use it as the prior when learning a harder one. We implement ProBO on a quadruped robot to narrow the reality gaps of a set of bounding gaits at different speeds. Results show that our method outperforms all other alternatives after training on the initial gait.
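The progressive idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional simulation parameter, the toy gap functions `easy_gap` and `hard_gap`, the kernel settings, and the lower-confidence-bound acquisition are all assumptions made for illustration. In particular, the sketch assumes that "making the model a prior" means using the posterior mean of the easy-gait Gaussian process as the prior mean of the harder one.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

class GP:
    """Tiny Gaussian process regressor with an optional prior mean function."""
    def __init__(self, prior_mean=None, noise=1e-4):
        self.prior_mean = prior_mean or (lambda x: np.zeros_like(x))
        self.noise = noise
        self.X = np.empty(0)
        self.y = np.empty(0)

    def add(self, x, y):
        self.X = np.append(self.X, x)
        self.y = np.append(self.y, y)

    def predict(self, x):
        m0 = self.prior_mean(x)
        if self.X.size == 0:
            # No data yet: fall back to the prior mean with unit variance.
            return m0, np.ones_like(x)
        K = rbf(self.X, self.X) + self.noise * np.eye(self.X.size)
        Ks = rbf(x, self.X)
        resid = self.y - self.prior_mean(self.X)
        mean = m0 + Ks @ np.linalg.solve(K, resid)
        var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
        return mean, np.sqrt(np.clip(var, 1e-12, None))

def bayes_opt(gap, gp, grid, n_iter, beta=2.0):
    """Minimise `gap` over `grid` with a lower-confidence-bound acquisition."""
    for _ in range(n_iter):
        mean, std = gp.predict(grid)
        x = grid[np.argmin(mean - beta * std)]
        gp.add(x, gap(x))
    return gp.X[np.argmin(gp.y)], gp.y.min()

# Hypothetical reality-gap functions of one simulation parameter in [0, 1].
def easy_gap(x):
    return (x - 0.6) ** 2

def hard_gap(x):
    return (x - 0.65) ** 2 + 0.02 * np.sin(25 * x)

grid = np.linspace(0.0, 1.0, 101)

# Stage 1: learn the easier trajectory from a zero-mean prior.
gp_easy = GP()
bayes_opt(easy_gap, gp_easy, grid, n_iter=15)

# Stage 2 (assumed reading of "prior"): reuse the easy posterior mean
# as the prior mean for the harder trajectory, with a smaller budget.
gp_hard = GP(prior_mean=lambda x: gp_easy.predict(x)[0])
best_x, best_gap = bayes_opt(hard_gap, gp_hard, grid, n_iter=5)
print(best_x, best_gap)
```

In this toy setting the second stage starts its search where the easy-gait model predicts a small gap rather than from an uninformed prior, which is the intuition behind reusing the earlier model. A real system would replace the toy functions with a trajectory-level discrepancy between simulation and hardware over several simulation parameters.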
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, C., Rosendo, A. (2023). Bridging the Reality Gap via Progressive Bayesian Optimisation. In: Cascalho, J.M., Tokhi, M.O., Silva, M.F., Mendes, A., Goher, K., Funk, M. (eds) Robotics in Natural Settings. CLAWAR 2022. Lecture Notes in Networks and Systems, vol 530. Springer, Cham. https://doi.org/10.1007/978-3-031-15226-9_17
DOI: https://doi.org/10.1007/978-3-031-15226-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15225-2
Online ISBN: 978-3-031-15226-9
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)