
Robotic Arm Control and Task Training Through Deep Reinforcement Learning

  • Conference paper
  • First Online:
Intelligent Autonomous Systems 16 (IAS 2021)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 412)


Abstract

Deep Reinforcement Learning (DRL) is a promising Machine Learning technique that enables robotic systems to efficiently learn high-dimensional control policies. However, generating good policies requires carefully defining appropriate reward functions, state spaces, and action spaces. There is no unique methodology for making these choices, and parameter tuning is time-consuming. In this paper, we investigate how the choice of both the reward function and the hyper-parameters affects the quality of the learned policy. To this end, we compare four DRL algorithms when learning continuous torque control policies for manipulation tasks via a model-free approach. In detail, we simulate one manipulator robot and formulate two tasks, a random target-reaching task and a pick-and-place application, each with two different reward functions. We then select the algorithms and multiple hyper-parameters, and exhaustively compare their learning performance across the two tasks. Finally, we include the simulated and real-world execution of our best policies. The obtained performance demonstrates the validity of our proposal. Users can follow our approach when selecting the best-performing algorithm for the task at hand. Moreover, they can exploit our results to solve the same tasks, even with other manipulator robots. The generated policies are easily portable to a physical setup while guaranteeing a perfect match between the simulated and real behaviors.

A. Franceschetti and E. Tosello—Equal contribution.
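To make the abstract's point about reward design concrete, the following is a minimal sketch, in Python with NumPy, of one plausible dense reward for the random target-reaching task: a negative end-effector-to-target distance plus a small torque penalty, since the learned policies output continuous torque commands. The function name, signature, and penalty coefficient are illustrative assumptions and do not reproduce the paper's exact reward formulations.

    # Minimal sketch (assumed formulation, not the authors' exact reward):
    # dense reward for random target reaching = negative distance between the
    # end effector and the target, minus a small penalty on joint torques.
    import numpy as np

    def reach_reward(ee_pos, target_pos, torques, torque_penalty=1e-3):
        # Euclidean distance from the end effector to the sampled target.
        distance = np.linalg.norm(ee_pos - target_pos)
        # Quadratic penalty discouraging aggressive continuous torque commands.
        effort = torque_penalty * float(np.square(torques).sum())
        return -(distance + effort)

    # Example usage: the reward approaches 0 as the end effector nears the target.
    r = reach_reward(np.array([0.3, 0.1, 0.5]),
                     np.array([0.4, 0.0, 0.5]),
                     np.zeros(6))  # six joint torques for a UR5-like arm

A sparse alternative (a bonus only when the distance falls below a threshold) is the usual counterpoint to such dense shaping, which is the kind of trade-off the paper's per-task comparison of two reward functions is designed to expose.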


Notes

  1. Developed MuJoCo model of a UR5 manipulator equipped with a Robotiq 3-finger gripper, available at http://www.mujoco.org/forum/index.php?resources/universal-robots-ur5-robotiq-s-model-3-finger-gripper.22/.

  2. Universal Robots UR5 specs, available at https://www.universal-robots.com/products/ur5-robot/.

  3. Robotiq 3-Finger adaptive gripper specs, available at https://robotiq.com/products/3-finger-adaptive-robot-gripper.

  4. Unified Robot Description Format (URDF) definition, available at http://wiki.ros.org/urdf.

  5. rllabplusplus, available at https://github.com/shaneshixiang/rllabplusplus.

  6. Video of the performed experiments, available at https://youtu.be/W1EMChcjkKA.


Acknowledgments

Part of this work was supported by MIUR (the Italian Ministry of Education, University and Research) under the initiative Departments of Excellence (Law 232/2016), and by Fondazione Cariverona under the project Collaborazione Uomo-Robot per Assemblaggi Manuali Intelligenti (CURAMI).

Author information


Corresponding author

Correspondence to Elisa Tosello.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Franceschetti, A., Tosello, E., Castaman, N., Ghidoni, S. (2022). Robotic Arm Control and Task Training Through Deep Reinforcement Learning. In: Ang Jr, M.H., Asama, H., Lin, W., Foong, S. (eds) Intelligent Autonomous Systems 16. IAS 2021. Lecture Notes in Networks and Systems, vol 412. Springer, Cham. https://doi.org/10.1007/978-3-030-95892-3_41
