Abstract
Visual simultaneous localisation and mapping (vSLAM) is routinely deployed for indoor and outdoor navigation, where it is exposed to visual complexities, particularly mirror reflections. The extent of mirror presence (the time a mirror is visible and its average size in the frame) was hypothesised to impact localisation and mapping performance, with systems using direct techniques expected to perform worse. A dataset of image sequences recorded in mirror environments, MirrEnv, was therefore collected and used to evaluate the performance of existing representative methods. RGBD ORB-SLAM3 and BundleFusion show moderate degradation of absolute trajectory error with increasing mirror duration, whilst the remaining results show no significant degradation of localisation performance. The generated mesh maps, however, proved highly inaccurate, with real and virtual (reflected) surfaces colliding in the reconstructions. The likely sources of error and of robustness in mirror environments are discussed, outlining future directions for validating and improving vSLAM performance in the presence of planar mirrors. The MirrEnv dataset is available at https://doi.org/10.17035/d.2023.0292477898.
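The localisation results above are reported as absolute trajectory error (ATE), the standard trajectory metric in vSLAM benchmarking. As a rough illustration only (not the paper's evaluation code), a minimal ATE-RMSE computation with rigid alignment via the Kabsch algorithm and no scale correction might look like the following; the function name `ate_rmse` is hypothetical:

```python
import numpy as np

def ate_rmse(est, gt):
    """RMSE of absolute trajectory error after rigid (rotation + translation)
    alignment. est, gt: (N, 3) arrays of estimated / ground-truth positions."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    # Kabsch algorithm: rotation that best maps est onto gt in least squares
    U, _, Vt = np.linalg.svd(E.T @ G)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_g - R @ mu_e
    aligned = est @ R.T + t
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))
```

Monocular systems are usually aligned with a similarity transform instead (adding a scale factor, as in the Umeyama method), since their trajectories are recovered only up to scale.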
Availability of data and materials
The MirrEnv dataset is available at https://doi.org/10.17035/d.2023.0292477898.
Acknowledgements
This research was funded by the UK EPSRC through a Doctoral Training Partnership, grant No. EP/T517951/1 (2435656).
Author information
Contributions
Peter Herbert: design of work; acquisition, analysis and interpretation of data; creation of new software used; manuscript drafting and revision. Jing Wu: conception and design of work; analysis and interpretation of data; manuscript drafting and revision. Ze Ji: conception and design of work; acquisition, analysis and interpretation of data; manuscript revision. Yu-Kun Lai: conception and design of work; analysis and interpretation of data; manuscript revision.
Ethics declarations
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Peter Herbert received his B.Sc. degree in mathematics from the University of Manchester, UK, and his M.Sc. degree in data science and analytics from Cardiff University, UK. His current research interests include computer vision, machine learning, and robot navigation.
Jing Wu is a lecturer in the School of Computer Science and Informatics at Cardiff University. Her research interests are in computer vision and visual analytics. She received her B.Sc. and M.Sc. degrees from Nanjing University, China, and her Ph.D. degree from the University of York, UK. She serves on the editorial board of Displays, and as a Programme Committee member of CGVC, BMVC, etc.
Ze Ji received his B.Eng. degree from Jilin University, China, M.Sc. degree from the University of Birmingham, UK, and Ph.D. degree from Cardiff University. He is currently a senior lecturer in the School of Engineering at Cardiff University. Prior to his current position, he worked in industry (Dyson, Lenovo, etc.) on autonomous robotics. His research interests include autonomous navigation, robot manipulation, robot learning, simultaneous localization and mapping, and tactile sensing.
Yu-Kun Lai is a professor in the School of Computer Science and Informatics, Cardiff University. He received his bachelor's and Ph.D. degrees in computer science from Tsinghua University, China, in 2003 and 2008, respectively. His research interests include computer graphics, computer vision, geometric modelling, and image processing.
Electronic supplementary material
Supplementary material, approximately 92.7 MB.
Supplementary material, approximately 94.0 MB.
Supplementary material, approximately 98.4 MB.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Herbert, P., Wu, J., Ji, Z. et al. Benchmarking visual SLAM methods in mirror environments. Comp. Visual Media 10, 215–241 (2024). https://doi.org/10.1007/s41095-022-0329-x