
Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder

  • Published in: Journal of Intelligent & Robotic Systems

Abstract

Visual scene understanding and place recognition are among the most challenging problems that mobile robots must solve to achieve autonomous navigation. To reduce the high computational complexity of global optimal search strategies, this paper develops a new two-stage loop closure detection (LCD) strategy. The front-end sequence node level matching (FSNLM) algorithm exploits the local continuity constraint of the motion process, which avoids a blind search for the global optimal match: it matches image nodes via a sliding window to accurately find locally optimal candidate node sets. In addition, the back-end image level matching (BILM) algorithm, combined with an improved semantic model, DeepLab_AE, uses a convolutional neural network (CNN) as a feature detector to extract visual descriptors, replacing traditional hand-crafted feature detectors, which do not generalize to all environments. Finally, the performance of the two-stage LCD algorithm is evaluated on five public datasets and compared with that of other state-of-the-art algorithms. The evaluation results show that the proposed method compares favorably with these alternatives.
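The front-end sliding-window idea described above can be sketched as follows. This is a minimal illustration only, not the authors' FSNLM implementation: the function names, the use of cosine similarity, and the exclusion/threshold parameters are assumptions made for the sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def sliding_window_candidates(descriptors, query_idx, window=3,
                              exclusion=10, threshold=0.8):
    """Slide a fixed-size window over past image nodes and return the
    start indices (with scores) of windows whose mean similarity to the
    query node exceeds `threshold`.

    `exclusion` keeps recently visited nodes out of consideration, so the
    search stays local instead of scanning for a global optimum.
    """
    query = descriptors[query_idx]
    last_start = query_idx - exclusion - window
    if last_start < 0:
        return []  # not enough history to form a candidate window
    candidates = []
    for start in range(last_start + 1):
        score = sum(cosine(query, descriptors[start + k])
                    for k in range(window)) / window
        if score >= threshold:
            candidates.append((start, score))
    # best-scoring local window first
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates
```

In a full pipeline, the windows returned here would be the candidate node sets handed to a back-end image-level verification stage; only those candidates, rather than the whole trajectory, need expensive descriptor comparison.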


Data Availability

All data generated or analysed during this study are included in this published article.


Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Nos. 61873175 and 71601022), the Key Project B Class of the Beijing Natural Science Fund (Grant No. KZ201710028028), the Youth Innovative Research Team of Capital Normal University, the Academy for Multidisciplinary Studies of Capital Normal University, and the Beijing Youth Talent Support Program (Grant No. CIT&TCD201804036).


Author information

Contributions

Zhonghua Wang developed the new two-stage loop closure detection algorithm and was the major contributor in writing the manuscript. Lifeng Wu analyzed and interpreted the experimental data regarding loop closure detection. Zhen Peng and Yong Guan made constructive comments on the algorithm and checked the manuscript for typos and grammar, improving the quality of the writing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lifeng Wu.

Ethics declarations

Ethical Approval

Not applicable.

Consent to Participate

Not applicable.

Consent to Publish

Not applicable.

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, Z., Peng, Z., Guan, Y. et al. Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder. J Intell Robot Syst 101, 29 (2021). https://doi.org/10.1007/s10846-020-01302-0


Keywords

Navigation