CCA-MTFCN: A Robotic Pushing-Grasping Collaborative Method Based on Deep Reinforcement Learning

  • Conference paper
  • In: Cognitive Systems and Information Processing (ICCSIP 2023)

Abstract

Robotic grasping in dense clutter is often infeasible because objects occlude and stack on one another; grasping stacked objects directly can cause collisions and leads to low efficiency and high failure rates. In practice, a lateral robotic push can separate stacked objects and create collision-free grasp affordances. Inspired by this, we devise CCA-MTFCN, a deep reinforcement learning method that learns the synergy between pushing and grasping policies to remove all objects from a heavily cluttered scene. Specifically, a hard parameter-sharing Multi-Task Fully Convolutional Network (MTFCN) is proposed to model the action-value function, and a multi-scale feature fusion mechanism is embedded in it to strengthen visual perception in cluttered environments. Moreover, a new reward function based on connected component analysis (CCA) is designed to evaluate the quality of push actions in pushing-grasping collaboration, which explicitly encourages pushes that aid subsequent grasps and thus improves the efficiency of sequential decision-making. Our approach is trained in simulation through trial and error. Evaluation experiments on object-removal tasks in dense clutter demonstrate that the proposed method outperforms several baseline approaches in task completion rate, grasp success rate, and action efficiency, and that it generalizes to new scenarios.
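The abstract describes a push reward derived from connected component analysis of the scene. As a rough illustration only (the paper's exact formulation is not given here), the sketch below counts connected object regions in a depth heightmap before and after a push and rewards pushes that increase the count, i.e. that separate stacked objects. The function name, height threshold, and reward value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy import ndimage


def cca_push_reward(height_before, height_after, obj_height_thresh=0.01):
    """Hypothetical sketch of a CCA-based push reward.

    A plausible proxy for "the push separated stacked objects" is an
    increase in the number of connected object regions in the heightmap.
    All constants here are assumptions for illustration.
    """
    # Binary occupancy masks: pixels that rise above the workspace plane.
    mask_before = height_before > obj_height_thresh
    mask_after = height_after > obj_height_thresh

    # Count 8-connected object clusters before and after the push.
    _, n_before = ndimage.label(mask_before, structure=np.ones((3, 3)))
    _, n_after = ndimage.label(mask_after, structure=np.ones((3, 3)))

    # Reward a push that splits a cluster into more separate components,
    # i.e. one that creates collision-free grasp affordances.
    return 0.5 if n_after > n_before else 0.0
```

In such a scheme, pushes that merely shift a pile without breaking it apart earn no reward, so the learned policy is steered toward pushes that directly enable grasping.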

Acknowledgments

This work was supported by the National Key R&D Program of China (Grant No. 2022YFB4700400) and the National Natural Science Foundation of China (Grant No. 62073249).

Author information

Correspondence to Huasong Min.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xu, H., Wang, Q., Min, H. (2024). CCA-MTFCN: A Robotic Pushing-Grasping Collaborative Method Based on Deep Reinforcement Learning. In: Sun, F., Meng, Q., Fu, Z., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2023. Communications in Computer and Information Science, vol 1918. Springer, Singapore. https://doi.org/10.1007/978-981-99-8018-5_5

  • DOI: https://doi.org/10.1007/978-981-99-8018-5_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8017-8

  • Online ISBN: 978-981-99-8018-5

  • eBook Packages: Computer Science, Computer Science (R0)
