SQ-SLAM: Monocular Semantic SLAM Based on Superquadric Object Representation

  • Regular paper
Journal of Intelligent & Robotic Systems

Abstract

Object SLAM uses additional semantic information to detect and map objects in the scene, in order to improve the system’s perception and map representation capabilities. Previous methods often use quadrics and cuboids to represent objects, especially in monocular systems. However, these simple shapes cannot represent many types of objects well, which limits the accuracy of object maps and in turn affects downstream task performance. In this paper, we propose a novel approach for representing objects in monocular SLAM using superquadrics (SQ) with shape parameters. Our method uses both object appearance and geometry information, enabling accurate estimation of object poses and adaptation to various object shapes. Additionally, we propose a lightweight data association strategy to accurately associate semantic observations across multiple views with object landmarks. We implement a monocular semantic SLAM system with real-time performance and conduct comprehensive experiments on public datasets. The results show that our method is able to build accurate object maps and outperforms state-of-the-art methods on object representation.
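
To illustrate what "superquadrics with shape parameters" means, the sketch below evaluates the standard superquadric inside-outside function in the object's canonical (axis-aligned, origin-centred) frame. It is an illustration only: the function name, argument names, and sample values are assumptions made here, and the paper's own parameterization, pose handling, and optimization are not shown in this excerpt.

```python
def superquadric_inside_outside(point, scales, eps1, eps2):
    """Standard superquadric inside-outside value F for a point given in the
    object's canonical frame (hypothetical helper, not the paper's code).

    scales = (a1, a2, a3) are the semi-axis lengths; eps1 and eps2 are the
    shape exponents. F < 1 means inside, F == 1 on the surface, F > 1 outside.
    """
    # Normalise each coordinate by its semi-axis; abs() keeps fractional
    # powers real-valued for negative coordinates.
    x, y, z = (abs(p) / a for p, a in zip(point, scales))
    xy_term = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy_term + z ** (2.0 / eps1)


# The same point near a "corner" of the unit cube, tested against two shapes.
corner = (0.6, 0.6, 0.6)
unit = (1.0, 1.0, 1.0)

f_sphere = superquadric_inside_outside(corner, unit, eps1=1.0, eps2=1.0)
f_boxlike = superquadric_inside_outside(corner, unit, eps1=0.1, eps2=0.1)

print("sphere-like (eps = 1.0):", f_sphere)
print("box-like    (eps = 0.1):", f_boxlike)
```

With both exponents equal to 1 the surface is an ellipsoid (here a unit sphere), so the corner point lies outside it. As the exponents approach 0 the surface approaches a cuboid, and the same point falls inside. This single family of shapes is what lets a superquadric cover objects that a plain quadric or cuboid fits poorly.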

Code or Data Availability

The code will be released upon acceptance. The datasets used are publicly available in:

TUM RGB-D Dataset: https://vision.in.tum.de/data/datasets/rgbd-dataset

ICL-NUIM Dataset: http://www.doc.ic.ac.uk/~ahanda/VaFRIC/iclnuim.html

RGB-D Scenes v2 Dataset: https://rgbd-dataset.cs.washington.edu/dataset/rgbd-scenes-v2/

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 61871074).

Author information

Contributions

All authors contributed to the study conception and design. Xiao Han collected the data, performed the analysis, and wrote the manuscript. Lu Yang commented on previous versions of the manuscript and critically revised the work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lu Yang.

Ethics declarations

Conflicts of interest

The authors have no relevant financial or non-financial interests to disclose.

About this article

Cite this article

Han, X., Yang, L. SQ-SLAM: Monocular Semantic SLAM Based on Superquadric Object Representation. J Intell Robot Syst 109, 29 (2023). https://doi.org/10.1007/s10846-023-01960-w

  • DOI: https://doi.org/10.1007/s10846-023-01960-w
