
Deep Reinforcement Learning

Chapter in Machine Learning Safety

Abstract

This chapter considers another important application of machine learning to robotics: the use of a deep reinforcement learning (DRL) agent for robot motion planning and control. We first present some preliminaries on training a DRL policy in robotics, followed by a discussion of sample efficiency concerning the sufficiency of training (Sect. 13.3) and an introduction to several statistical methods for evaluation (Sect. 13.4). Afterwards, we discuss how to formally express the relevant properties (Sect. 13.5) and then focus on reusing the verification tools developed for convolutional neural networks to work with deep reinforcement learning, by considering the verification of policy generalisation (Sect. 13.6), the verification of state-based policy robustness (Sect. 13.7), and the verification of temporal policy robustness (Sect. 13.8). In addition, we discuss how to address the well-known sim-to-real challenge in robotics with these verification techniques (Sect. 13.9).
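To make the notion of state-based policy robustness (Sect. 13.7) concrete, the following is a minimal, illustrative sketch in Python. A toy linear policy stands in for a trained DRL network, and robustness is probed empirically by sampling perturbations within an L-infinity ball around a state. The chapter's verification tools would instead certify the property over the entire ball; the weight matrix, the state, and the perturbation bound here are invented purely for illustration.

```python
import numpy as np

# Toy deterministic policy: a single linear layer followed by argmax,
# standing in for a trained DRL policy network (4-dim state, 3 actions).
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.5]])

def policy(state):
    """Return the greedy (argmax) action for a given state."""
    return int(np.argmax(state @ W))

def empirically_robust(state, epsilon, n_samples=1000, seed=0):
    """Sample random perturbations with L-infinity norm at most epsilon
    and check that the chosen action never changes. This is a statistical
    probe; a formal verifier would certify the whole epsilon-ball."""
    rng = np.random.default_rng(seed)
    a0 = policy(state)
    for _ in range(n_samples):
        delta = rng.uniform(-epsilon, epsilon, size=state.shape)
        if policy(state + delta) != a0:
            return False
    return True

s = np.array([1.0, 0.2, 0.1, 0.0])
# With epsilon = 0.01, the margin between the top two logits (0.8) exceeds
# the worst-case logit shift, so the action is provably stable here.
print(empirically_robust(s, epsilon=0.01))  # True
```

Note that a sampling-based check of this kind can only falsify robustness, never prove it; the reachability-style tools discussed later in the chapter close that gap by bounding the policy's output over the whole perturbation set.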




Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Huang, X., Jin, G., Ruan, W. (2023). Deep Reinforcement Learning. In: Machine Learning Safety. Artificial Intelligence: Foundations, Theory, and Algorithms. Springer, Singapore. https://doi.org/10.1007/978-981-19-6814-3_13


  • DOI: https://doi.org/10.1007/978-981-19-6814-3_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-6813-6

  • Online ISBN: 978-981-19-6814-3

  • eBook Packages: Computer Science (R0)
