Abstract
Artificial intelligence (AI) is making manufacturing smarter and more autonomous. Flexible robots now collaborate with humans on the shop floor to enhance productivity and efficiency. For an autonomous robotic system to grasp objects properly, object classification and pose estimation are crucial problems. Extensive research is being conducted to achieve low-cost, computationally efficient, real-time assessments. However, most existing approaches are computationally expensive and depend on prior knowledge of an object’s 3D structure. This article presents an AI-based solution that generalizes the grasping of cuboid- and cylindrical-shaped objects in real time, irrespective of their dimensions. The AI algorithm achieved an average precision of 89.44% and 82.43% for cuboid- and cylindrical-shaped objects, respectively. Objects are identified without knowledge of their 3D models, and the pose is estimated accurately in real time. The integrated solution has been implemented in a robotic system fitted with two grippers, a conveyor system, and sensors. Results of several experiments that validate the solution are reported. In these experiments, the proposed methodology achieved 100% accuracy in grasping objects on the conveyor belt.
Author information
Contributions
All the authors contributed to the study’s conception and design. The data collection and analysis were performed by Ritam Upadhyay and Abhishek Asi. The conveyor design, analyses, and fabrication were performed by Nidhi Prasad and Pravanjan Nayak. The sensor interaction and robotic implementation were performed by Ritam Upadhyay and Pravanjan Nayak. The first draft of the manuscript was written by Ritam Upadhyay and Debasish Mishra, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the Topical Collection: New Intelligent Manufacturing Technologies through the Integration of Industry 4.0 and Advanced Manufacturing.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Upadhyay, R., Asi, A., Nayak, P. et al. Real-time deep learning–based image processing for pose estimation and object localization in autonomous robot applications. Int J Adv Manuf Technol 127, 1905–1919 (2023). https://doi.org/10.1007/s00170-022-09994-4
DOI: https://doi.org/10.1007/s00170-022-09994-4