Abstract
Robotic grasping methods based on sparse partial point clouds have achieved excellent grasping performance on a variety of objects. However, they often generate incorrect grasp candidates because geometric information about the object is incomplete. In this work, we propose a novel and robust sparse shape completion model, TransSC. Taking a segmented partial point cloud as input, TransSC uses a transformer-based encoder to extract richer point-wise features and a manifold-based decoder to recover finer object detail. Quantitative experiments verify the effectiveness of the proposed shape completion network and show that it outperforms existing methods. In addition, TransSC is integrated into a grasp evaluation network to generate a set of grasp candidates. Simulation experiments show that TransSC improves grasp generation compared with existing shape completion baselines. Furthermore, our robotic experiments show that with TransSC, the robot achieves a higher success rate when grasping an unknown number of objects randomly placed on a support surface.
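To make the two-stage design described above concrete, the following is a minimal, hypothetical PyTorch sketch: a transformer encoder produces point-wise features from the segmented partial cloud, and a folding-style manifold decoder deforms a 2D grid, conditioned on a global shape code, into the completed shape. This is not the authors' released implementation (see the repository linked below); the layer sizes, grid resolution, and max-pooled global feature are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class TransSCSketch(nn.Module):
    """Illustrative encoder-decoder for sparse point cloud completion."""

    def __init__(self, d_model=256, n_heads=4, n_layers=4, grid_size=45):
        super().__init__()
        self.grid_size = grid_size  # decoder emits grid_size**2 points
        # Per-point embedding of the segmented partial cloud (B, N, 3).
        self.embed = nn.Sequential(
            nn.Linear(3, d_model), nn.ReLU(), nn.Linear(d_model, d_model))
        # Self-attention over all points yields richer point-wise features
        # than a purely local, PointNet-style MLP.
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=512, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Folding-style decoder: deform a 2D manifold grid, conditioned on
        # a global shape code, into the completed 3D surface.
        self.fold = nn.Sequential(
            nn.Linear(d_model + 2, d_model), nn.ReLU(),
            nn.Linear(d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, 3))

    def forward(self, partial):                    # partial: (B, N, 3)
        feat = self.encoder(self.embed(partial))   # (B, N, d_model)
        glob = feat.max(dim=1).values              # global shape code (B, d_model)
        g = self.grid_size
        u, v = torch.meshgrid(torch.linspace(0, 1, g),
                              torch.linspace(0, 1, g), indexing="ij")
        grid = torch.stack([u, v], dim=-1).reshape(1, -1, 2)
        grid = grid.to(partial.device).expand(partial.size(0), -1, -1)
        cond = glob.unsqueeze(1).expand(-1, grid.size(1), -1)
        return self.fold(torch.cat([cond, grid], dim=-1))  # (B, g*g, 3)


if __name__ == "__main__":
    model = TransSCSketch()
    completed = model(torch.randn(2, 1024, 3))  # two partial clouds, 1024 points each
    print(completed.shape)                      # torch.Size([2, 2025, 3])
```

In practice, such a model would be trained with a permutation-invariant loss such as the Chamfer distance between the completed and ground-truth clouds; the grid resolution here simply fixes the output density.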
Code or Data Availability
Our code will be released at https://github.com/turbohiro/TransSC.
Acknowledgements
We thank Mech-Mind Robotics Company for providing the 3D camera.
Funding
Open Access funding enabled and organized by Projekt DEAL. This research was funded by the German Research Foundation (DFG) and the National Natural Science Foundation of China (NSFC) within the project Crossmodal Learning (DFG TRR-169/NSFC) and the project DEXMAN under grant 410816101, and was partially supported by the European H2020 project Ultracept (778602).
Author information
Contributions
List of all authors: Wenkai Chen, Hongzhuo Liang, Zhaopeng Chen, Fuchun Sun and Jianwei Zhang.
All authors contributed to the conception and design of this manuscript. The technical work was carried out by Wenkai Chen and Hongzhuo Liang. The manuscript was revised by Zhaopeng Chen, Fuchun Sun and Jianwei Zhang. All authors commented on the manuscript.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Consent for Publication
All participants gave their consent for the publication of this manuscript.
Consent to Participate
All participants consented to be involved in the creation of this manuscript.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chen, W., Liang, H., Chen, Z. et al. Improving Object Grasp Performance via Transformer-Based Sparse Shape Completion. J Intell Robot Syst 104, 45 (2022). https://doi.org/10.1007/s10846-022-01586-4