G-SAM: A Robust One-Shot Keypoint Detection Framework for PnP Based Robot Pose Estimation

Zhong, Xiaopin; Zhu, Wenxuan; Liu, Weixiang; Yi, Jianye; Liu, Chengxiang; Wu, Zongze

doi:10.1007/s10846-023-01957-5

Regular paper
Published: 22 September 2023

Volume 109, article number 28, (2023)
Journal of Intelligent & Robotic Systems

Xiaopin Zhong^1,2, Wenxuan Zhu^1, Weixiang Liu^1, Jianye Yi^1, Chengxiang Liu^1 & Zongze Wu^1,2

Abstract

Robot pose estimation plays a fundamental role in various applications involving service and industrial robots. Among the methods for estimating robot pose from a single image, the Perspective-n-Point (PnP) based approach is widely used because of its efficiency. An important part of this framework is keypoint detection. However, the keypoint detection modules currently used for PnP have two problems: a small number of input keypoints and a large error in the input keypoints. This paper proposes a Grouping and Soft-ArgMax (G-SAM) framework to address both. First, a simple and powerful Soft-ArgMax module followed by point subset selection is designed to address the small number of input keypoints. Second, a grouping module that takes into account the texture and spatial structure information of the robot is introduced to reduce the large error in the input keypoints. Extensive experiments compare the proposed framework with existing state-of-the-art methods on several public datasets and demonstrate that it provides more reliable, more accurate and faster pose estimation for robotic applications.
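To make the Soft-ArgMax idea named in the abstract concrete, the following is a minimal sketch of the generic operation: a heatmap is normalised with a softmax and the keypoint is read off as the expected pixel coordinate, which allows sub-pixel locations. It is not the authors' implementation, which is not shown on this page. The function name `soft_argmax_2d` and the sharpness parameter `beta` are assumptions made for illustration.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the (x, y) expected coordinate of a softmax-normalised heatmap.

    heatmap: list of rows, each a list of floats (raw scores).
    beta: hypothetical sharpness factor (assumption, not from the paper);
          larger values move the result towards the hard argmax.
    """
    # Subtract the global maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)

    # Expected column index (x) and row index (y) under the softmax weights.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# Toy heatmap whose peak is shared by two neighbouring pixels in row 1.
toy = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
x, y = soft_argmax_2d(toy)
print(x, y)
```

On this toy heatmap a hard argmax has to pick one of the two peak pixels, whereas the expectation lies between them by symmetry. Because the operation is a weighted average, it is also differentiable with respect to the heatmap scores.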

Data Availability

Not Applicable.

Code Availability

Not Applicable.

Acknowledgements

We would like to thank the anonymous reviewers.

Funding

This work received funding from the National Key R&D Program of China (2020AAA0108304), the National Natural Science Foundation of China (No. 62171288), the Shenzhen University 2035 Program for Excellent Research (00000224) and the Open Research Fund from the Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ).

Author information

Authors and Affiliations

College of Mechatronics and Control Engineering, Shenzhen University, #3688 Nanhai Ave, Shenzhen, 518060, China
Xiaopin Zhong, Wenxuan Zhu, Weixiang Liu, Jianye Yi, Chengxiang Liu & Zongze Wu
Guangdong Artificial Intelligence and Digital Economy Laboratory (Shenzhen), Kelian Road, Shenzhen, 518107, China
Xiaopin Zhong & Zongze Wu

Contributions

All authors contributed to the study conception and design, material preparation, and the writing and review of the paper. The first draft of the manuscript was written by Xiaopin Zhong and revised by Weixiang Liu. The experimental part of the manuscript was mainly completed by Wenxuan Zhu. Jianye Yi, Chengxiang Liu and Zongze Wu contributed to the experimental design and the writing of the manuscript by participating in the discussions and providing valuable insights. All authors commented on previous versions of the manuscript and approved the final manuscript. Supervision was mainly performed by Xiaopin Zhong and Weixiang Liu.

Corresponding author

Correspondence to Weixiang Liu.

Ethics declarations

Ethics Approval

This is purely a review paper. The research team involved in this research confirms that no ethical approval is required.

Consent to Participate

Not Applicable.

Consent for Publication

Not Applicable.

Conflict of Interest

Not Applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Xiaopin Zhong and Wenxuan Zhu contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhong, X., Zhu, W., Liu, W. et al. G-SAM: A Robust One-Shot Keypoint Detection Framework for PnP Based Robot Pose Estimation. J Intell Robot Syst 109, 28 (2023). https://doi.org/10.1007/s10846-023-01957-5

Received: 05 April 2023
Accepted: 21 August 2023
Published: 22 September 2023
DOI: https://doi.org/10.1007/s10846-023-01957-5
