
Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data

  • Original Paper
  • Published:
Machine Vision and Applications

Abstract

Grasping objects intelligently is a challenging task even for humans, and we spend a considerable part of our childhood learning to grasp objects correctly. Robots, however, cannot be given that much time to learn how to grasp effectively. In the present research we therefore propose an efficient learning architecture based on VQVAE, so that robots can be taught to grasp with a sufficient amount of data corresponding to correct grasps. Obtaining sufficient labelled data is extremely difficult in the robot grasping domain, so we investigate a semi-supervised learning-based model that generalizes well even with a limited labelled data set. Its performance shows a 6% improvement over existing state-of-the-art models, including our earlier model. During experimentation, we observed that the proposed model, RGGCNN2, performs significantly better than existing approaches that do not use unlabelled data for generating grasping rectangles, both for isolated objects and for objects in a cluttered environment. To the best of our knowledge, an intelligent robot grasping model trained through semi-supervised representation learning, which exploits the high-quality learning ability of the GGCNN2 architecture with a limited labelled data set together with the learned latent embeddings, can serve as a de facto training method; this is established and validated in this paper through rigorous hardware experiments using the Baxter (Anukul) research robot (video demonstration).
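As a concrete illustration of the cascade described above, the following minimal PyTorch sketch (not the authors' implementation; all module names, layer sizes and shapes are illustrative assumptions) shows how a vector-quantized encoder, trainable on unlabelled depth images, can feed a GGCNN2-style fully convolutional head that predicts per-pixel grasp quality, angle and width maps:

    # Minimal sketch of the cascaded idea: a VQ encoder learns latent embeddings
    # and a GGCNN2-style head predicts per-pixel grasp maps from those embeddings.
    # Module names, sizes and shapes are illustrative assumptions, not the paper's code.
    import torch
    import torch.nn as nn

    class VectorQuantizer(nn.Module):
        """Nearest-neighbour codebook lookup with straight-through gradients."""
        def __init__(self, num_codes=64, code_dim=32):
            super().__init__()
            self.codebook = nn.Embedding(num_codes, code_dim)
            self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

        def forward(self, z):                               # z: (B, C, H, W)
            b, c, h, w = z.shape
            flat = z.permute(0, 2, 3, 1).reshape(-1, c)     # (B*H*W, C)
            idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
            zq = self.codebook(idx).view(b, h, w, c).permute(0, 3, 1, 2)
            return z + (zq - z).detach()                    # straight-through estimator

    class GraspHead(nn.Module):
        """GGCNN2-like head: per-pixel quality, angle (cos/sin of 2*theta) and width maps."""
        def __init__(self, in_ch=32):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
            self.quality = nn.Conv2d(16, 1, 1)
            self.cos2 = nn.Conv2d(16, 1, 1)
            self.sin2 = nn.Conv2d(16, 1, 1)
            self.width = nn.Conv2d(16, 1, 1)

        def forward(self, z):
            f = self.trunk(z)
            return self.quality(f), self.cos2(f), self.sin2(f), self.width(f)

    encoder = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(32, 32, 3, padding=1))
    vq, head = VectorQuantizer(), GraspHead()

    depth = torch.randn(2, 1, 300, 300)      # batch of depth images
    z_q = vq(encoder(depth))                 # quantized latent embedding
    q, c, s, w = head(z_q)                   # per-pixel grasp maps
    print(q.shape)                           # torch.Size([2, 1, 300, 300])

In a semi-supervised setting of this kind, the encoder and codebook can be trained on the full, largely unlabelled image set, while only the grasp head requires the limited labelled grasping rectangles.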



Abbreviations

VAE: Variational auto-encoder

VQVAE: Vector-quantized VAE

CNN: Convolutional neural network

GGCNN: Generative grasp CNN

GGCNN2: Generative grasp CNN-2

RGGCNN: Representation-based GGCNN

RGGCNN2: Representation-based GGCNN2


Acknowledgements

The present research is partially funded by the I-Hub foundation for Cobotics (Technology Innovation Hub of IIT-Delhi set up by the Department of Science and Technology, Govt. of India).

Author information

Corresponding author

Correspondence to Priya Shukla.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Shukla, P., Kushwaha, V. & Nandi, G.C. Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data. Machine Vision and Applications 34, 99 (2023). https://doi.org/10.1007/s00138-023-01459-2
