Abstract
Bimanual activities such as coffee stirring, which require coordinating two arms, are common in daily life yet difficult for robots to learn. Reinforcement learning is a promising approach to such tasks, since it lets the robot explore how the two arms coordinate to accomplish a shared goal. However, this field faces two main challenges: designing the coordination mechanism and decomposing long-horizon tasks. We therefore propose the Mixline method, which learns each sub-task separately with an online algorithm and then composes them, using the data generated during online training, with an offline algorithm. We built a learning environment on the GPU-accelerated Isaac Gym simulator, in which the bimanual robot successfully learned to grasp, hold, and lift the spoon and cup, insert the spoon into the cup, and stir the coffee. The proposed method has the potential to be extended to other long-horizon bimanual tasks.
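The online-then-offline composition can be summarized in a short sketch. The following is a minimal, hypothetical PyTorch illustration, not the authors' released code: the four-stage sub-task list, the state/action dimensions, the function names (`collect_online`, `compose_offline`), and the stand-in random dynamics (in place of the Isaac Gym simulator) are all assumptions, as is the choice of a PPO-style online learner (here elided to a data-collection loop) and a CQL-style conservative critic for the offline stage.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, GAMMA = 24, 7, 0.99        # assumed dual-arm sizes
SUB_TASKS = ["grasp", "lift", "insert", "stir"]   # assumed decomposition

def mlp(in_dim, out_dim):
    """Small two-layer network used for both policy and critic."""
    return nn.Sequential(nn.Linear(in_dim, 128), nn.Tanh(),
                         nn.Linear(128, out_dim))

def collect_online(policy, steps=256):
    """Online stage (a PPO-style learner in the paper's setting; elided):
    roll out one sub-task policy and keep every transition it generates."""
    data, s = [], torch.randn(STATE_DIM)
    for _ in range(steps):
        a = policy(s).detach()
        s_next, r = torch.randn(STATE_DIM), torch.randn(())  # stand-in dynamics
        data.append((s, a, r, s_next))
        s = s_next
    return data

def compose_offline(data, epochs=5, alpha=1.0, lr=3e-4):
    """Offline stage: fit one long-horizon policy to the pooled sub-task
    transitions, with a CQL-style penalty that pushes down Q-values of
    actions the dataset does not contain."""
    policy = mlp(STATE_DIM, ACTION_DIM)
    q = mlp(STATE_DIM + ACTION_DIM, 1)
    opt_q = torch.optim.Adam(q.parameters(), lr=lr)
    opt_pi = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, r, s_next in data:
            # Conservative critic update: Bellman error plus a penalty
            # lowering Q on policy actions relative to dataset actions.
            with torch.no_grad():
                target = r + GAMMA * q(torch.cat([s_next, policy(s_next)]))
            q_data = q(torch.cat([s, a]))
            q_pi = q(torch.cat([s, policy(s).detach()]))
            critic_loss = (q_data - target).pow(2).mean() \
                          + alpha * (q_pi - q_data).mean()
            opt_q.zero_grad()
            critic_loss.backward()
            opt_q.step()
            # Actor update: act greedily w.r.t. the conservative critic.
            actor_loss = -q(torch.cat([s, policy(s)])).mean()
            opt_pi.zero_grad()
            actor_loss.backward()
            opt_pi.step()
    return policy

if __name__ == "__main__":
    pooled = []
    for name in SUB_TASKS:
        # Untrained networks stand in for the trained sub-task policies.
        pooled += collect_online(mlp(STATE_DIM, ACTION_DIM))
        print(f"collected {len(pooled)} transitions after '{name}'")
    long_horizon_policy = compose_offline(pooled)
```

In the paper's setting the pooled transitions would come from trained sub-task policies acting in Isaac Gym; here random networks and random dynamics merely stand in so the pipeline runs end to end.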
Z. Sun and Z. Wang contributed equally to this work.
This work was supported in part by the Research Grants Council of the Hong Kong Special Administrative Region, China under Grant 24209021, in part by the VC Fund of the CUHK T Stone Robotics Institute under Grant 4930745, in part by CUHK Direct Grant for Research under Grant 4055140, and in part by the Hong Kong Centre for Logistics Robotics.