Abstract
Object SLAM uses semantic information to detect and map objects in the scene, improving the system’s perception and map representation capabilities. Previous methods, especially monocular ones, often represent objects with quadrics or cuboids. These simple shapes cannot represent many types of objects effectively, which limits the accuracy of object maps and in turn degrades downstream task performance. In this paper, we propose a novel approach for representing objects in monocular SLAM using superquadrics (SQ) with shape parameters. Our method makes comprehensive use of object appearance and geometry information, enabling accurate estimation of object poses and adaptation to various object shapes. Additionally, we propose a lightweight data association strategy that accurately associates semantic observations across multiple views with object landmarks. We implement a monocular semantic SLAM system with real-time performance and conduct comprehensive experiments on public datasets. The results show that our method builds accurate object maps and outperforms state-of-the-art methods on object representation.
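To illustrate why shape parameters let a superquadric adapt to different object shapes, the following minimal sketch evaluates the standard superquadric inside-outside function in the object's canonical frame. It is not the paper's implementation: the function name, argument names, and the example parameter values are assumptions chosen for illustration, and the parameterization used in SQ-SLAM may differ.

```python
def superquadric_inside_outside(point, scale, eps1, eps2):
    """Standard superquadric inside-outside function (canonical object frame).

    point: (x, y, z) coordinates expressed in the object's own frame.
    scale: (a1, a2, a3) semi-axis lengths along x, y, z.
    eps1, eps2: shape exponents (hypothetical names); values near 1 give an
        ellipsoid, values near 0 give a box-like shape.
    Returns a value < 1 inside the surface, 1 on it, and > 1 outside.
    """
    # Normalize each coordinate by its semi-axis; abs() keeps fractional
    # powers real for negative coordinates.
    x, y, z = (abs(c) / s for c, s in zip(point, scale))
    xy_term = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy_term + z ** (2.0 / eps1)


# A point near the corner of the unit cube, tested against two shapes
# that share the same unit semi-axes.
corner = (0.9, 0.9, 0.9)
unit_axes = (1.0, 1.0, 1.0)
as_ellipsoid = superquadric_inside_outside(corner, unit_axes, 1.0, 1.0)
as_box_like = superquadric_inside_outside(corner, unit_axes, 0.1, 0.1)
print("outside ellipsoid:", as_ellipsoid > 1.0)
print("inside box-like shape:", as_box_like < 1.0)
```

With identical semi-axes, the corner point lies outside the ellipsoid (exponents of 1) but inside the box-like shape (exponents of 0.1), so changing only the two exponents moves the surface between a rounded and a nearly cuboid form.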
Code or Data Availability
The code will be released upon acceptance. The datasets used are publicly available at:
TUM RGB-D Dataset: https://vision.in.tum.de/data/datasets/rgbd-dataset
ICL-NUIM Dataset: http://www.doc.ic.ac.uk/~ahanda/VaFRIC/iclnuim.html
RGB-D Scenes v2 Dataset: https://rgbd-dataset.cs.washington.edu/dataset/rgbd-scenes-v2/
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61871074).
Author information
Contributions
All authors contributed to the study conception and design. Xiao Han collected the data, performed the analysis, and wrote the manuscript. Lu Yang commented on previous versions of the manuscript and critically revised the work. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Han, X., Yang, L. SQ-SLAM: Monocular Semantic SLAM Based on Superquadric Object Representation. J Intell Robot Syst 109, 29 (2023). https://doi.org/10.1007/s10846-023-01960-w