
Robotic Arm Control and Task Training Through Deep Reinforcement Learning

  • Conference paper
  • First Online:
Intelligent Autonomous Systems 16 (IAS 2021)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 412)


Abstract

Deep Reinforcement Learning (DRL) is a promising Machine Learning technique that enables robotic systems to efficiently learn high-dimensional control policies. However, generating good policies requires carefully defining appropriate reward functions, state spaces, and action spaces. There is no unique methodology for making these choices, and parameter tuning is time-consuming. In this paper, we investigate how the choice of both the reward function and the hyper-parameters affects the quality of the learned policy. To this end, we compare four DRL algorithms when learning continuous torque control policies for manipulation tasks via a model-free approach. In detail, we simulate one manipulator robot and formulate two tasks, a random target-reaching task and a pick-and-place application, each with two different reward functions. We then select the algorithms and multiple hyper-parameters, and exhaustively compare their learning performance across the two tasks. Finally, we include the simulated and real-world execution of our best policies. The obtained performance demonstrates the validity of our proposal. Users can follow our approach when selecting the best-performing algorithm for the task at hand. Moreover, they can exploit our results to solve the same tasks, even with other manipulator robots. The generated policies are easily portable to a physical setup while guaranteeing a perfect match between the simulated and real behaviors.

A. Franceschetti and E. Tosello—Equal contribution.
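To make the abstract's point about reward design concrete, the following is a minimal sketch, in Python with NumPy, of one plausible dense reward for the random target-reaching task: a negative end-effector-to-target distance plus a small torque penalty, since the learned policies output continuous torque commands. The function name, signature, and penalty coefficient are illustrative assumptions and do not reproduce the paper's exact reward formulations.

    # Minimal sketch (assumed formulation, not the authors' exact reward):
    # dense reward for random target reaching = negative distance between the
    # end effector and the target, minus a small penalty on joint torques.
    import numpy as np

    def reach_reward(ee_pos, target_pos, torques, torque_penalty=1e-3):
        # Euclidean distance from the end effector to the sampled target.
        distance = np.linalg.norm(ee_pos - target_pos)
        # Quadratic penalty discouraging aggressive continuous torque commands.
        effort = torque_penalty * float(np.square(torques).sum())
        return -(distance + effort)

    # Example usage: the reward approaches 0 as the end effector nears the target.
    r = reach_reward(np.array([0.3, 0.1, 0.5]),
                     np.array([0.4, 0.0, 0.5]),
                     np.zeros(6))  # six joint torques for a UR5-like arm

A sparse alternative (a bonus only when the distance falls below a threshold) is the usual counterpoint to such dense shaping, which is the kind of trade-off the paper's per-task comparison of two reward functions is designed to expose.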


Notes

  1. Developed MuJoCo model of a UR5 manipulator equipped with a Robotiq 3-finger gripper, available at http://www.mujoco.org/forum/index.php?resources/universal-robots-ur5-robotiq-s-model-3-finger-gripper.22/.

  2. Universal Robots UR5 specs, available at https://www.universal-robots.com/products/ur5-robot/.

  3. Robotiq 3-Finger adaptive gripper specs, available at https://robotiq.com/products/3-finger-adaptive-robot-gripper.

  4. Unified Robot Description Format (URDF) definition, available at http://wiki.ros.org/urdf.

  5. rllabplusplus, available at https://github.com/shaneshixiang/rllabplusplus.

  6. Video of the performed experiments, available at https://youtu.be/W1EMChcjkKA.


Acknowledgments

Part of this work was supported by MIUR (the Italian Ministry of Education, University and Research) under the initiative Departments of Excellence (Law 232/2016), and by Fondazione Cariverona under the project Collaborazione Uomo-Robot per Assemblaggi Manuali Intelligenti (CURAMI).

Author information


Corresponding author

Correspondence to Elisa Tosello.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Franceschetti, A., Tosello, E., Castaman, N., Ghidoni, S. (2022). Robotic Arm Control and Task Training Through Deep Reinforcement Learning. In: Ang Jr, M.H., Asama, H., Lin, W., Foong, S. (eds) Intelligent Autonomous Systems 16. IAS 2021. Lecture Notes in Networks and Systems, vol 412. Springer, Cham. https://doi.org/10.1007/978-3-030-95892-3_41
