Online object-level SLAM with dual bundle adjustment

Applied Intelligence

Abstract

Object-level landmarks enable a SLAM system to construct robust object-keyframe constraints for bundle adjustment and to improve pose estimation performance. In this paper, we present a real-time online object-level SLAM system. We propose a dual Bundle Adjustment (BA) optimization method, comprising a high-frequency and a low-frequency module, to optimize the estimated pose. The High-frequency BA (HBA) module quickly estimates the camera pose by matching landmarks of keyframes against feature points of the current frame. The estimated camera pose is then passed to the Low-frequency BA (LBA) module to improve trajectory accuracy. The LBA module integrates object-level landmarks into the pose graph to optimize the camera poses in local mapping. Moreover, we build an additional object detection thread to extract 2D object bounding boxes online. We also improve data association through the depth projection of point-line features and the Euclidean distance between object centroids. Experimental results show that the proposed algorithm effectively reduces the drift error of camera pose estimation and improves accuracy by a large margin on different datasets.
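To make the centroid-based part of the data association concrete, the following is a minimal sketch of a greedy nearest-centroid matcher with a distance gate. It is an illustration only, not the method as implemented in the paper: the function name `associate_by_centroid`, the gating threshold `max_dist`, and the sample coordinates are all assumptions, and the depth projection of point-line features is not modelled here.

```python
import math


def associate_by_centroid(detections, landmarks, max_dist=0.5):
    """Greedy nearest-centroid association (illustrative sketch).

    detections: list of (x, y, z) object centroids from the current frame,
                assumed to be already expressed in the world frame.
    landmarks:  dict mapping landmark id -> (x, y, z) centroid in the map.
    max_dist:   hypothetical gating threshold; matches farther than this
                are rejected.
    Returns one entry per detection: the matched landmark id, or None.
    """
    matches = []
    used = set()  # each landmark may be matched at most once
    for det in detections:
        best_id, best_d = None, max_dist
        for lm_id, lm in landmarks.items():
            if lm_id in used:
                continue
            d = math.dist(det, lm)  # Euclidean distance between centroids
            if d < best_d:
                best_id, best_d = lm_id, d
        if best_id is not None:
            used.add(best_id)
        matches.append(best_id)
    return matches


# Made-up example data: two map objects and three detections.
landmarks = {0: (0.0, 0.0, 2.0), 1: (1.5, 0.0, 3.0)}
detections = [(1.4, 0.1, 3.1), (0.05, 0.0, 2.0), (5.0, 5.0, 5.0)]
print(associate_by_centroid(detections, landmarks))  # [1, 0, None]
```

A detection that returns `None` has no landmark within the gate and, in a full system, would typically be a candidate for a new object landmark. A greedy matcher is used here only for brevity; a globally optimal assignment is a common alternative.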

Notes

  1. Our code is made publicly available at: https://github.com/Jake755/object_slam.git

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants U2033218, 61831018, and 61802253, in part by the Shanghai Local Capacity Enhancement project (No. 21010501500), and in part by the “Science and Technology Innovation Action Plan” of the Shanghai Science and Technology Commission for social development project under Grant 21DZ1204900.

Author information

Corresponding authors

Correspondence to Yongbin Gao or Xiaoyan Jiang.

Ethics declarations

Conflict of interest

The corresponding author of this paper is an associate editor of Applied Intelligence.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, J., Gao, Y., Jiang, X. et al. Online object-level SLAM with dual bundle adjustment. Appl Intell 53, 25092–25105 (2023). https://doi.org/10.1007/s10489-023-04854-4

  • DOI: https://doi.org/10.1007/s10489-023-04854-4
