Real-time deep learning–based image processing for pose estimation and object localization in autonomous robot applications

Upadhyay, Ritam; Asi, Abhishek; Nayak, Pravanjan; Prasad, Nidhi; Mishra, Debasish; Pal, Surjya K.

doi:10.1007/s00170-022-09994-4

ORIGINAL ARTICLE
Published: 05 September 2022

Volume 127, pages 1905–1919, (2023)

The International Journal of Advanced Manufacturing Technology

Ritam Upadhyay¹,
Abhishek Asi¹,
Pravanjan Nayak²,
Nidhi Prasad³,
Debasish Mishra⁴ &
…
Surjya K. Pal ORCID: orcid.org/0000-0003-2182-6349⁵

585 Accesses
2 Citations

Abstract

Artificial intelligence (AI) is making manufacturing smarter and more autonomous. Flexible robots now collaborate with humans on the shop floor to enhance productivity and efficiency. For an autonomous robotic system to grasp objects properly, object classification and pose estimation are crucial problems. Extensive research aims at low-cost, computationally efficient, real-time assessment. However, most existing approaches are computationally expensive and depend on prior knowledge of an object's 3D structure. This article presents an AI-based solution that generalizes the grasping of cuboid- and cylindrical-shaped objects in real time, irrespective of their dimensions. The AI algorithm achieved an average precision of 89.44% for cuboid-shaped objects and 82.43% for cylindrical-shaped objects. Objects are identified without knowledge of their 3D models, and the pose is estimated accurately in real time. The integrated solution has been implemented in a robotic system fitted with two grippers, a conveyor system, and sensors. Results of several experiments that validate the solution are reported. In these experiments, the proposed methodology achieved 100% accuracy in grasping objects on the conveyor belt.
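The abstract reports average precision per object class but does not say how it was computed. As a rough illustration only, the sketch below shows one common convention for detection tasks (area under the precision-recall curve with all-point interpolation). The function name `average_precision` and the input layout are assumptions made for this example, not the authors' evaluation code.

```python
def average_precision(detections, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs, one per
        predicted box (hypothetical layout chosen for this sketch).
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas over each increase in recall.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap
```

Under this convention, a class whose detections are all correct and cover every annotated object scores 1.0, and each false positive ranked above a true positive lowers the score.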

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Birla Institute of Technology Mesra, Mesra, 835215, India
Ritam Upadhyay & Abhishek Asi
Centre of Excellence in Advanced Manufacturing Technology, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
Pravanjan Nayak
Department of Mechanical Engineering, Birla Institute of Technology Mesra, Mesra, 835215, India
Nidhi Prasad
Advanced Technology Development Centre, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
Debasish Mishra
Department of Mechanical Engineering, Indian Institute of Technology Kharagpur, Kharagpur, 721302, India
Surjya K. Pal

Contributions

All the authors contributed to the study’s conception and design. The data collection and analysis were performed by Ritam Upadhyay and Abhishek Asi. The conveyor design, analyses, and fabrication were performed by Nidhi Prasad and Pravanjan Nayak. The sensor interaction and robotic implementation were performed by Ritam Upadhyay and Pravanjan Nayak. The first draft of the manuscript was written by Ritam Upadhyay and Debasish Mishra, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Surjya K. Pal.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection: New Intelligent Manufacturing Technologies through the Integration of Industry 4.0 and Advanced Manufacturing.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Upadhyay, R., Asi, A., Nayak, P. et al. Real-time deep learning–based image processing for pose estimation and object localization in autonomous robot applications. Int J Adv Manuf Technol 127, 1905–1919 (2023). https://doi.org/10.1007/s00170-022-09994-4

Received: 07 January 2022
Accepted: 18 August 2022
Published: 05 September 2022
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00170-022-09994-4
