Abstract
Robotic grasping in dense clutter is often infeasible because objects occlude and stack on one another. Directly grasping stacked objects can cause collisions, leading to low efficiency and high failure rates. In practice, a lateral robotic push can separate stacked objects to create collision-free grasp affordances. Inspired by this, we devise CCA-MTFCN, a deep reinforcement learning method that learns synergies between pushing and grasping policies to remove all objects from a heavily cluttered environment. Specifically, a hard parameter-sharing Multi-Task Fully Convolutional Network (MTFCN) is proposed to model the action-value function, and a multi-scale feature fusion mechanism is incorporated into it to enhance visual perception in cluttered environments. Moreover, a new reward function based on connected component analysis (CCA) is designed to effectively evaluate the quality of push actions in pushing-grasping collaboration. This allows us to explicitly encourage pushes that aid subsequent grasping, improving the efficiency of sequential decision-making. Our approach was trained in simulation through trial and error, and evaluation experiments on object-removal tasks in dense clutter demonstrate that the proposed method outperforms several baselines in task completion rate, grasp success rate, and action efficiency, while also generalizing to new scenarios.
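The intuition behind a CCA-based push reward can be illustrated with a minimal sketch: a push that separates stacked or adjacent objects increases the number of connected components in the workspace occupancy mask, so pushes whose component count grows are rewarded. The function name `cca_push_reward` and the `bonus` value are illustrative assumptions, not the paper's exact formulation; connected components are computed with `scipy.ndimage.label`.

```python
import numpy as np
from scipy import ndimage

def cca_push_reward(mask_before, mask_after, bonus=0.5):
    """Hypothetical CCA-based push reward sketch.

    mask_before / mask_after: binary occupancy masks of the workspace
    (1 = object pixel) captured before and after a push action.
    Returns a positive reward only if the push increased the number of
    connected components, i.e. it separated clustered objects.
    """
    _, n_before = ndimage.label(mask_before)  # count components pre-push
    _, n_after = ndimage.label(mask_after)    # count components post-push
    return bonus if n_after > n_before else 0.0

# Example: one connected strip of objects is split into two pieces.
before = np.zeros((6, 6), dtype=int)
before[2, 1:5] = 1                      # single component
after = np.zeros((6, 6), dtype=int)
after[2, 1] = 1
after[2, 4] = 1                         # two separated components
```

Under this scheme, a push that merely slides an intact cluster earns no reward, which is one way to discourage unproductive pushes in the pushing-grasping synergy.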
Acknowledgments
This work was supported by the National Key R&D Program of China (Grant No. 2022YFB4700400) and the National Natural Science Foundation of China (Grant No. 62073249).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Xu, H., Wang, Q., Min, H. (2024). CCA-MTFCN: A Robotic Pushing-Grasping Collaborative Method Based on Deep Reinforcement Learning. In: Sun, F., Meng, Q., Fu, Z., Fang, B. (eds) Cognitive Systems and Information Processing. ICCSIP 2023. Communications in Computer and Information Science, vol 1918. Springer, Singapore. https://doi.org/10.1007/978-981-99-8018-5_5
Print ISBN: 978-981-99-8017-8
Online ISBN: 978-981-99-8018-5