CCA-MTFCN: A Robotic Pushing-Grasping Collaborative Method Based on Deep Reinforcement Learning

  • Conference paper
  • In: Cognitive Systems and Information Processing (ICCSIP 2023)

Abstract

Robotic grasping in dense clutter is often infeasible because objects occlude and stack on one another; grasping stacked objects directly can cause collisions and leads to low efficiency and high failure rates. In practice, a lateral robotic push can separate stacked objects and create collision-free grasp affordances. Inspired by this, we devise CCA-MTFCN, a deep reinforcement learning method that learns the synergy between pushing and grasping policies to remove all objects from a heavily cluttered scene. Specifically, a hard parameter-sharing Multi-Task Fully Convolutional Network (MTFCN) is proposed to model the action-value function, and a multi-scale feature fusion mechanism is embedded in it to strengthen visual perception in cluttered environments. Moreover, a new reward function based on connected component analysis (CCA) is designed to evaluate the quality of push actions in pushing-grasping collaboration, which explicitly encourages pushes that aid subsequent grasps and thus improves the efficiency of sequential decision-making. Our approach is trained in simulation through trial and error. Evaluation experiments on object-removal tasks in dense clutter demonstrate that the proposed method outperforms several baseline approaches in task completion rate, grasp success rate, and action efficiency, and that it generalizes to new scenarios.
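The abstract describes a push reward derived from connected component analysis of the scene. As a rough illustration only (the paper's exact formulation is not given here), the sketch below counts connected object regions in a depth heightmap before and after a push and rewards pushes that increase the count, i.e. that separate stacked objects. The function name, height threshold, and reward value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy import ndimage


def cca_push_reward(height_before, height_after, obj_height_thresh=0.01):
    """Hypothetical sketch of a CCA-based push reward.

    A plausible proxy for "the push separated stacked objects" is an
    increase in the number of connected object regions in the heightmap.
    All constants here are assumptions for illustration.
    """
    # Binary occupancy masks: pixels that rise above the workspace plane.
    mask_before = height_before > obj_height_thresh
    mask_after = height_after > obj_height_thresh

    # Count 8-connected object clusters before and after the push.
    _, n_before = ndimage.label(mask_before, structure=np.ones((3, 3)))
    _, n_after = ndimage.label(mask_after, structure=np.ones((3, 3)))

    # Reward a push that splits a cluster into more separate components,
    # i.e. one that creates collision-free grasp affordances.
    return 0.5 if n_after > n_before else 0.0
```

In such a scheme, pushes that merely shift a pile without breaking it apart earn no reward, so the learned policy is steered toward pushes that directly enable grasping.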

Acknowledgments

This work was supported by the National Key R&D Program of China (Grant No. 2022YFB4700400) and the National Natural Science Foundation of China (Grant No. 62073249).

Author information

Correspondence to Huasong Min.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xu, H., Wang, Q., Min, H. (2024). CCA-MTFCN: A Robotic Pushing-Grasping Collaborative Method Based on Deep Reinforcement Learning. In: Sun, F., Meng, Q., Fu, Z., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2023. Communications in Computer and Information Science, vol 1918. Springer, Singapore. https://doi.org/10.1007/978-981-99-8018-5_5

  • DOI: https://doi.org/10.1007/978-981-99-8018-5_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8017-8

  • Online ISBN: 978-981-99-8018-5

  • eBook Packages: Computer Science, Computer Science (R0)
