Abstract
Recent research on vision-based self-localization has produced versatile and reliable real-time Visual Simultaneous Localization and Mapping (VSLAM) systems. However, retrieving ground truth, estimating calibration parameters, and annotating useful labels all require cumbersome human labor. Moreover, environments contain many object instances, while traditional mapping modules can only estimate 3D information for isolated sparse or semi-dense feature points. To bridge this gap, we present a VSLAM method based on a synthetic dataset that can effectively utilize texture-less object instances. We also propose several new evaluation criteria that take full advantage of the ground truth and annotations available in synthetic datasets. The proposed VSLAM method includes newly designed feature extraction, matching, localization, and mapping modules, which jointly use object features and point features to estimate camera 6-Degrees-Of-Freedom (6-DOF) poses and construct richer maps. Experiments are conducted on the proposed datasets, under the proposed criteria, with several state-of-the-art VSLAM methods to demonstrate the utility of our datasets. Owing to the fusion of object features in the co-visibility graph, the system can perform scale-aware bundle adjustment to reduce accumulated errors. The advantages of the proposed VSLAM method are demonstrated through experiments on both synthetic and real-world datasets.
Notes
- 1.
Many works that describe themselves as “Visual Odometry” (VO) share many features with SLAM systems; we therefore refer to both VO and Visual SLAM as “VSLAM” in the remainder of this chapter.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this chapter
Dong, Y., Liu, Y., Xu, S. (2023). Visual SLAM for Texture-Less Environment. In: Fan, R., Guo, S., Bocus, M.J. (eds) Autonomous Driving Perception. Advances in Computer Vision and Pattern Recognition. Springer, Singapore. https://doi.org/10.1007/978-981-99-4287-9_8
Print ISBN: 978-981-99-4286-2
Online ISBN: 978-981-99-4287-9