
Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder

  • Published in: Journal of Intelligent & Robotic Systems

Abstract

Visual scene understanding and place recognition are among the most challenging problems that mobile robots must solve to achieve autonomous navigation. To reduce the high computational complexity of global optimal search strategies, this paper develops a new two-stage loop closure detection (LCD) strategy. The front-end sequence node level matching (FSNLM) algorithm exploits the local continuity constraint of the motion process, which avoids a blind search for the global optimal match: it matches image nodes via a sliding window to accurately find locally optimal candidate node sets. In addition, the back-end image level matching (BILM) algorithm, combined with an improved semantic model, DeepLab_AE, uses a convolutional neural network (CNN) as a feature detector to extract visual descriptors, replacing traditional hand-crafted feature detectors, which do not generalize to all environments. Finally, the performance of the two-stage LCD algorithm is evaluated on five public datasets and compared with that of other state-of-the-art algorithms. The evaluation results show that the proposed method compares favorably with these alternatives.
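The front-end sliding-window idea described above can be sketched as follows. This is a minimal illustration only, not the authors' FSNLM implementation: the function names, the use of cosine similarity, and the exclusion/threshold parameters are assumptions made for the sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def sliding_window_candidates(descriptors, query_idx, window=3,
                              exclusion=10, threshold=0.8):
    """Slide a fixed-size window over past image nodes and return the
    start indices (with scores) of windows whose mean similarity to the
    query node exceeds `threshold`.

    `exclusion` keeps recently visited nodes out of consideration, so the
    search stays local instead of scanning for a global optimum.
    """
    query = descriptors[query_idx]
    last_start = query_idx - exclusion - window
    if last_start < 0:
        return []  # not enough history to form a candidate window
    candidates = []
    for start in range(last_start + 1):
        score = sum(cosine(query, descriptors[start + k])
                    for k in range(window)) / window
        if score >= threshold:
            candidates.append((start, score))
    # best-scoring local window first
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates
```

In a full pipeline, the windows returned here would be the candidate node sets handed to a back-end image-level verification stage; only those candidates, rather than the whole trajectory, need expensive descriptor comparison.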


Data Availability

All data generated or analysed during this study are included in this published article.


Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Nos. 61873175 and 71601022), the Key Project B Class of the Beijing Natural Science Fund (Grant No. KZ201710028028), the Youth Innovative Research Team of Capital Normal University, the Academy for Multidisciplinary Studies of Capital Normal University, and the Beijing Youth Talent Support Program (Grant No. CIT&TCD201804036).


Author information

Contributions

Zhonghua Wang developed the new two-stage loop closure detection algorithm and was the major contributor in writing the manuscript. Lifeng Wu analyzed and interpreted the experimental data regarding loop closure detection. Zhen Peng and Yong Guan made constructive comments on the algorithm and checked the manuscript for typos and grammar, improving the quality of the writing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lifeng Wu.

Ethics declarations

Ethical Approval

Not applicable.

Consent to Participate

Not applicable.

Consent to Publish

Not applicable.

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, Z., Peng, Z., Guan, Y. et al. Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder. J Intell Robot Syst 101, 29 (2021). https://doi.org/10.1007/s10846-020-01302-0


Keywords

Navigation