Bridging the Reality Gap via Progressive Bayesian Optimisation

  • Conference paper

Robotics in Natural Settings (CLAWAR 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 530)


Abstract

The recent rapid development of deep-learning-based control strategies has made the reality gap a critical issue at the forefront of robotics, especially for legged robots. We propose a novel system identification framework, Progressive Bayesian Optimisation (ProBO), to bridge the reality gap by tuning simulation parameters. Since the reality gap is usually harder to narrow for dynamic locomotion trajectories than for static ones, we train a Gaussian process model on the easier trajectory data set and use it as a prior to start the learning process for a harder one. We implement ProBO on a quadruped robot to narrow the reality gaps of a set of bounding gaits at different speeds. Results show that our method outperforms all other alternatives after training on the initial gait.
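The progressive warm-start idea in the abstract can be sketched with off-the-shelf tools. The following is a minimal, hypothetical one-parameter illustration, not the paper's implementation: the discrepancy functions `gap_easy`/`gap_hard`, the single friction-like parameter `theta`, and the use of scikit-learn's `GaussianProcessRegressor` with an expected-improvement acquisition are all assumptions made for the sketch. The key step is that observations from the easier gait seed the GP before optimisation of the harder gait begins.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the real pipeline: each returns the
# sim-vs-real trajectory discrepancy for a given simulation parameter.
def gap_easy(theta):            # static/slow gait: smooth landscape
    return (theta - 0.6) ** 2

def gap_hard(theta):            # dynamic bounding gait: shifted, rougher
    return (theta - 0.55) ** 2 + 0.05 * np.sin(20 * theta)

candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)

# Phase 1: evaluate the easier gait and collect (theta, discrepancy) data.
X = rng.uniform(0.0, 1.0, size=(8, 1))
y = np.array([gap_easy(x[0]) for x in X])

# Phase 2: reuse those observations as the prior data set when
# optimising the harder gait (the progressive warm start).
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-3)
for _ in range(15):
    gp.fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.min()
    # Expected improvement, minimisation form.
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, gap_hard(x_next[0]))

theta_star = X[np.argmin(y), 0]
print(f"tuned parameter: {theta_star:.3f}")
```

Because the GP enters the harder optimisation already shaped by the easier gait's landscape, the acquisition function can focus its early queries near the previously found optimum instead of exploring from scratch, which is where the sample-efficiency gain would come from.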



Author information

Correspondence to Chen Yu or Andre Rosendo.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yu, C., Rosendo, A. (2023). Bridging the Reality Gap via Progressive Bayesian Optimisation. In: Cascalho, J.M., Tokhi, M.O., Silva, M.F., Mendes, A., Goher, K., Funk, M. (eds) Robotics in Natural Settings. CLAWAR 2022. Lecture Notes in Networks and Systems, vol 530. Springer, Cham. https://doi.org/10.1007/978-3-031-15226-9_17
