
Mixline: A Hybrid Reinforcement Learning Framework for Long-Horizon Bimanual Coffee Stirring Task

  • Conference paper

Intelligent Robotics and Applications (ICIRA 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13455)

Abstract

Bimanual activities such as coffee stirring, which require the coordination of two arms, are common in daily life yet difficult for robots to learn. Reinforcement learning is a promising approach to such tasks, since it allows the robot to explore how the two arms can coordinate to accomplish a shared goal. However, the field faces two main challenges: designing a coordination mechanism and decomposing long-horizon tasks. We therefore propose the Mixline method, which learns sub-tasks separately via an online algorithm and then composes them, using the data generated in the process, via an offline algorithm. We constructed a learning environment based on the GPU-accelerated Isaac Gym. In our experiments, the bimanual robot successfully learned to grasp, hold, and lift the spoon and cup, insert the spoon into the cup, and stir the coffee. The proposed method has the potential to be extended to other long-horizon bimanual tasks.
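The training recipe the abstract describes, online learning of each sub-task followed by offline composition over the pooled rollout data, can be summarized in a short sketch. The snippet below is a minimal illustration under assumptions of ours: the `Policy` network, `train_online`, `train_offline`, the toy reward, and all dimensions are hypothetical stand-ins. The online stage stands in for an on-policy learner such as PPO, and the offline stage for the paper's offline learner; the authors' actual algorithms and interfaces may differ.

```python
# Hypothetical sketch of a Mixline-style pipeline: train each sub-task
# online, keep the rollouts, then fit one long-horizon policy offline.
# All names and sizes here are illustrative, not the authors' API.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 32, 14  # placeholder dual-arm observation/action sizes

class Policy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 128), nn.Tanh(), nn.Linear(128, ACT_DIM))

    def forward(self, obs):
        return self.net(obs)

def train_online(policy, reward_fn, iters=10):
    """Stand-in for the online stage (e.g., an on-policy learner such as
    PPO) on one sub-task; returns the rollouts it generated."""
    opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
    rollouts = []
    for _ in range(iters):
        obs = torch.randn(64, OBS_DIM)       # placeholder observation batch
        act = policy(obs)
        loss = -reward_fn(obs, act).mean()   # maximize the placeholder reward
        opt.zero_grad(); loss.backward(); opt.step()
        rollouts.append((obs, act.detach()))
    return rollouts

def train_offline(policy, data, epochs=5):
    """Stand-in for the offline stage that composes the pooled sub-task
    rollouts into a single long-horizon policy."""
    opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
    for _ in range(epochs):
        for obs, act in data:
            # Simple regression onto logged actions as a placeholder for
            # the paper's offline objective.
            loss = ((policy(obs) - act) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
    return policy

if __name__ == "__main__":
    reward_fn = lambda obs, act: -(act ** 2).sum(dim=-1)  # toy reward
    pooled = []
    for subtask in ["grasp", "lift", "insert", "stir"]:   # sub-task sequence
        pooled += train_online(Policy(), reward_fn)
    composite = train_offline(Policy(), pooled)
    print(f"composite policy trained on {len(pooled)} logged batches")
```

In the paper's setting, the rollouts would come from the Isaac Gym environment and the offline stage would be a proper offline RL learner rather than plain regression; the sketch only shows the data flow from per-sub-task online training to offline composition.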

Z. Sun and Z. Wang contributed equally to this work.

This work was supported in part by the Research Grants Council of the Hong Kong Special Administrative Region, China under Grant 24209021, in part by the VC Fund of the CUHK T Stone Robotics Institute under Grant 4930745, in part by CUHK Direct Grant for Research under Grant 4055140, and in part by the Hong Kong Centre for Logistics Robotics.



Author information

Correspondence to Miao Li or Fei Chen.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Sun, Z., Wang, Z., Liu, J., Li, M., Chen, F. (2022). Mixline: A Hybrid Reinforcement Learning Framework for Long-Horizon Bimanual Coffee Stirring Task. In: Liu, H., et al. (eds.) Intelligent Robotics and Applications. ICIRA 2022. Lecture Notes in Computer Science, vol 13455. Springer, Cham. https://doi.org/10.1007/978-3-031-13844-7_58


  • DOI: https://doi.org/10.1007/978-3-031-13844-7_58

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13843-0

  • Online ISBN: 978-3-031-13844-7

  • eBook Packages: Computer Science, Computer Science (R0)
