Abstract
Physics simulators hold great promise for conveniently training reinforcement learning policies in safe, unconstrained environments. However, transferring the acquired knowledge to the real world can be challenging due to the reality gap. To this end, several methods have recently been proposed to automatically tune simulator parameters with posterior distributions given real data, for use with domain randomization at training time. These approaches have been shown to work for various robotic tasks under different settings and assumptions. Nevertheless, the existing literature lacks a thorough comparison of adaptive domain randomization methods with respect to transfer performance and real-data efficiency. This work presents an open benchmark of both offline and online methods (SimOpt, BayRn, DROID, DROPO) to investigate their current limitations across multiple settings and tasks. We found that online methods are limited by the quality of the currently learned policy, which conditions the data collected for the next iteration, while offline methods may fail when replaying trajectories in simulation with open-loop commands. The code used is publicly available at https://github.com/gabrieletiboni/adr-benchmark.
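As a toy illustration of the offline idea described above (not the paper's actual algorithms), the following sketch replays logged open-loop action commands in a simulator and searches for the dynamics parameter that best reproduces the observed states. The 1-D point-mass model, function names, and candidate grid are all invented for illustration:

```python
import random


def simulate(mass, actions, dt=0.05):
    """Toy 1-D point-mass simulator: replay an open-loop action sequence."""
    pos, vel = 0.0, 0.0
    traj = []
    for a in actions:
        vel += (a / mass) * dt  # acceleration from force command
        pos += vel * dt
        traj.append(pos)
    return traj


def replay_error(mass, actions, real_traj):
    """Mean squared discrepancy between simulated and observed states."""
    sim = simulate(mass, actions)
    return sum((s - r) ** 2 for s, r in zip(sim, real_traj)) / len(real_traj)


def fit_parameter(actions, real_traj, candidates):
    """Offline search: pick the simulator parameter that best replays real data."""
    return min(candidates, key=lambda m: replay_error(m, actions, real_traj))


rng = random.Random(0)
true_mass = 2.0  # unknown "real-world" dynamics parameter
actions = [rng.uniform(-1.0, 1.0) for _ in range(50)]
real_traj = simulate(true_mass, actions)  # stand-in for logged real trajectories
candidates = [0.5 + 0.1 * i for i in range(36)]  # grid over 0.5 .. 4.0
best = fit_parameter(actions, real_traj, candidates)
print(best)  # mass estimate close to the true value of 2.0
```

Note that this replay step is exactly where offline methods can break down in practice: if the real system was controlled in closed loop, feeding the recorded commands back open-loop can drive the simulated state away from the real trajectory regardless of the parameter values.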
References
Antonova, R., Cruciani, S., Smith, C., Kragic, D.: Reinforcement learning for pivoting task (2017). arXiv Preprint: arXiv:1703.00472v1
Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., Zaremba, W.: Openai gym (2016). arXiv:1606.01540v1
Chebotar, Y., Handa, A., Makoviychuk, V., Macklin, M., Issac, J., Ratliff, N., Fox, D.: Closing the sim-to-real loop: adapting simulation randomization with real world experience. In: ICRA (2019)
Chen, X., Hu, J., Jin, C., Li, L., Wang, L.: Understanding domain randomization for sim-to-real transfer. In: ICLR (2022)
Ding, Z., Tsai, Y., Lee, W.W., Huang, B.: Sim-to-real transfer for robotic manipulation with tactile sensory. In: IROS (2021)
Finn, C., Zhang, M., Fu, J., Tan, X., McCarthy, Z., Scharff, E., Levine, S.: Guided policy search code implementation (2016). Software available from http://rll.berkeley.edu/gps
Hansen, N.: The CMA evolution strategy: a comparing review. In: Towards a New Evolutionary Computation, pp. 75–102. Springer (2006)
James, S., Davison, A., Johns, E.: Transferring end-to-end visuomotor control from simulation to real world for a multi-stage task. In: CoRL, PMLR, pp. 334–343 (2017)
Kober, J., Bagnell, J.A., Peters, J.: Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32(11), 1238–1274 (2013)
Mehta, B., Diaz, M., Golemo, F., Pal, C.J., Paull, L.: Active domain randomization. In: CoRL (2020)
Mehta, B., Handa, A., Fox, D., Ramos, F.: A user's guide to calibrating robotics simulators. In: CoRL (2020)
Muratore, F., Eilers, C., Gienger, M., Peters, J.: Data-efficient domain randomization with Bayesian optimization. IEEE Robot. Autom. Lett. 6(2), 911–918 (2021)
Muratore, F., Gienger, M., Peters, J.: Assessing transferability from simulation to reality for reinforcement learning. IEEE TPAMI 43(4), 1172–1183 (2021)
Muratore, F., Gruner, T., Wiese, F., Belousov, B., Gienger, M., Peters, J.: Neural posterior domain randomization. In: Faust, A., Hsu, D., Neumann, G. (eds.) Proceedings of the 5th Conference on Robot Learning, Proceedings of Machine Learning Research, vol. 164, pp. 1532–1542. PMLR (2022)
Muratore, F., Ramos, F., Turk, G., Yu, W., Gienger, M., Peters, J.: Robot learning from randomized simulations: a review. Front. Robot. AI 9, 799893 (2022)
OpenAI, Akkaya, I., Andrychowicz, M., Chociej, M., Litwin, M., McGrew, B., Petron, A., Paino, A., Plappert, M., Powell, G., Ribas, R., Schneider, J., Tezak, N., Tworek, J., Welinder, P., Weng, L., Yuan, Q., Zaremba, W., Zhang, L.: Solving Rubik’s cube with a robot hand (2019). arXiv Preprint: arXiv:1910.07113v1
Peng, X.B., Andrychowicz, M., Zaremba, W., Abbeel, P.: Sim-to-real transfer of robotic control with dynamics randomization. In: ICRA (2018)
Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., Dormann, N.: Stable-baselines3: Reliable reinforcement learning implementations. J. Mach. Learn. Res. 22(268), 1–8 (2021). http://jmlr.org/papers/v22/20-1364.html
Rajeswaran, A., Ghotra, S., Ravindran, B., Levine, S.: EPOpt: learning robust neural network policies using model ensembles. In: ICLR (2017)
Ramos, F., Possas, R.C., Fox, D.: BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators. In: RSS (2019)
Sadeghi, F., Levine, S.: CAD2RL: real single-image flight without a single real image. In: RSS (2017)
Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., Bohez, S., Vanhoucke, V.: Sim-to-real: learning agile locomotion for quadruped robots. In: RSS (2018)
Tiboni, G., Arndt, K., Kyrki, V.: DROPO: sim-to-real transfer with offline domain randomization (2022). arXiv Preprint: arXiv:2201.08434v1
Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: IROS (2017)
Tsai, Y., Xu, H., Ding, Z., Zhang, C., Johns, E., Huang, B.: DROID: minimizing the reality gap using single-shot human demonstration. IEEE Robot. Autom. Lett. 6(2), 3168–3175 (2021)
Valassakis, E., Di Palo, N., Johns, E.: Coarse-to-fine for sim-to-real: sub-millimetre precision across wide task spaces. In: IROS (2021)
Vuong, Q., Vikram, S., Su, H., Gao, S., Christensen, H.: How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies? In: ICRA (2019)
Zhao, W., Queralta, J.P., Westerlund, T.: Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 737–744. IEEE (2020)
Acknowledgments
We acknowledge the computational resources generously provided by HPC@POLITO and by the Aalto Science-IT project.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tiboni, G., Arndt, K., Averta, G., Kyrki, V., Tommasi, T. (2023). Online vs. Offline Adaptive Domain Randomization Benchmark. In: Borja, P., Della Santina, C., Peternel, L., Torta, E. (eds) Human-Friendly Robotics 2022. HFR 2022. Springer Proceedings in Advanced Robotics, vol 26. Springer, Cham. https://doi.org/10.1007/978-3-031-22731-8_12
Print ISBN: 978-3-031-22730-1
Online ISBN: 978-3-031-22731-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)