Semantic SLAM Based on Joint Constraint in Dynamic Environment

  • Yuliang Tang
  • Yingchun Fan
  • Shaofeng Liu
  • Xin Jing
  • Jintao Yao
  • Hong Han
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11902)


Abstract

Most existing SLAM (Simultaneous Localization and Mapping) methods assume that the scene is static, so large errors occur when the camera moves through a highly dynamic environment. In this paper, we present an efficient and robust visual SLAM system that combines dynamic feature point detection with semantic segmentation. We obtain stable feature points through a proposed depth constraint. Combined with the semantic information provided by BlitzNet, every image in the sequence is divided into an environment region and a potential dynamic region. Then, using the fundamental matrix estimated from the environment region to construct an epipolar line constraint, dynamic feature points in the potential dynamic region can be identified effectively. We estimate the motion of the camera using the stable static feature points obtained from the joint constraints. When constructing the environment map, moving objects are removed, while static objects are retained in the map together with their semantic information. The proposed system is evaluated both on the TUM RGB-D dataset and in real scenes. The results demonstrate that the proposed system obtains a high-accuracy camera trajectory in dynamic environments and effectively eliminates smear effects in the constructed semantic point cloud map.
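The epipolar line constraint used above to flag dynamic points can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: in practice the fundamental matrix F would be estimated robustly (e.g. with RANSAC) from matches in the environment region, and the pixel threshold tuned per sequence.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance (in pixels) from point p2 in frame 2 to the epipolar line
    of its match p1 in frame 1. Points are homogeneous [x, y, 1]; F is a
    3x3 fundamental matrix given as nested lists."""
    # Epipolar line in frame 2: l = F @ p1, i.e. l[0]*x + l[1]*y + l[2] = 0
    l = [sum(F[i][j] * p1[j] for j in range(3)) for i in range(3)]
    return abs(sum(p2[i] * l[i] for i in range(3))) / math.hypot(l[0], l[1])

def is_dynamic(F, p1, p2, threshold=1.0):
    """A matched feature whose epipolar distance exceeds the threshold
    violates static-scene geometry and is flagged as dynamic."""
    return epipolar_distance(F, p1, p2) > threshold

# Toy F for a pure sideways camera translation: epipolar lines are
# horizontal, so a static point keeps its row (y) while a moving one does not.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
print(is_dynamic(F, [10, 5, 1], [40, 5, 1]))   # → False (static match)
print(is_dynamic(F, [10, 5, 1], [40, 12, 1]))  # → True (dynamic match)
```

Points passing this test in the potential dynamic region are kept as static; the rest are excluded from camera motion estimation.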


Keywords: SLAM · Semantic segmentation · Joint constraint · Dynamic objects


References

  1. Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: large-scale direct monocular SLAM. In: European Conference on Computer Vision, pp. 834–849 (2014)
  2. Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE Trans. Pattern Anal. Mach. Intell. 40(3), 611–625 (2018)
  3. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)
  4. Wolf, D.F., Sukhatme, G.S.: Mobile robot simultaneous localization and mapping in dynamic environments. Auton. Robots 19(1), 53–65 (2005)
  5. Wang, C.C., Thorpe, C., Thrun, S.: Online simultaneous localization and mapping with detection and tracking of moving objects: theory and results from a ground vehicle in crowded urban areas. In: IEEE International Conference on Robotics and Automation, pp. 842–849 (2003)
  6. Bibby, C., Reid, I.: Simultaneous localisation and mapping in dynamic environments (SLAMIDE) with reversible data association. In: Proceedings of Robotics: Science and Systems, pp. 105–112 (2007)
  7. Zou, D., Tan, P.: CoSLAM: collaborative visual SLAM in dynamic environments. IEEE Trans. Pattern Anal. Mach. Intell. 35(2), 354–366 (2013)
  8. Rebecq, H., Horstschaefer, T., Scaramuzza, D.: Real-time visual-inertial odometry for event cameras using keyframe-based nonlinear optimization. In: British Machine Vision Conference (2017)
  9. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
  10. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
  11. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241 (2015)
  12. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions (2015). arXiv preprint arXiv:1511.07122
  13. Chen, L.C., Papandreou, G., Kokkinos, I.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)
  14. Zhao, H., Shi, J., Qi, X.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
  15. Zhao, H., Qi, X., Shen, X., Shi, J., Jia, J.: ICNet for real-time semantic segmentation on high-resolution images. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 418–434. Springer, Cham (2018)
  16. He, K., Gkioxari, G., Dollár, P.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
  17. Dvornik, N., Shmelkov, K., Mairal, J.: BlitzNet: a real-time deep network for scene understanding. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4154–4162 (2017)
  18. McCormac, J., Handa, A., Davison, A.: SemanticFusion: dense 3D semantic mapping with convolutional neural networks. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 4628–4635 (2017)
  19. Li, X., Belaroussi, R.: Semi-dense 3D semantic mapping from monocular SLAM (2016). arXiv preprint arXiv:1611.04144
  20. Bowman, S.L., Atanasov, N., Daniilidis, K.: Probabilistic data association for semantic SLAM. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 1722–1729 (2017)
  21. Yu, C., Liu, Z., Liu, X.J.: DS-SLAM: a semantic visual SLAM towards dynamic environments. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1168–1174 (2018)
  22. Bescos, B., Fácil, J.M., Civera, J.: DynaSLAM: tracking, mapping, and inpainting in dynamic scenes. IEEE Robot. Autom. Lett. 3(4), 4076–4083 (2018)
  23. Gao, X., Zhang, T.: Robust RGB-D simultaneous localization and mapping using planar point features. Robot. Auton. Syst. 72, 1–14 (2015)
  24. Sturm, J., Engelhard, N., Endres, F.: A benchmark for the evaluation of RGB-D SLAM systems. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573–580 (2012)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yuliang Tang (1, 2)
  • Yingchun Fan (1)
  • Shaofeng Liu (1)
  • Xin Jing (2)
  • Jintao Yao (2)
  • Hong Han (1, 2)
  1. School of Artificial Intelligence, Xidian University, Xi’an, China
  2. Shaanxi Key Laboratory of Integrated and Intelligent Navigation, Xi’an, China
