Realization of CUDA-based real-time multi-camera visual SLAM in embedded systems

  • Jincheng Li
  • Guoqing Deng
  • Wen Zhang (email author)
  • Chaofan Zhang
  • Fan Wang
  • Yong Liu
Special Issue Paper


Real-time multi-camera visual simultaneous localization and mapping (SLAM) in embedded systems is vital for autonomous robot navigation. However, because feature extraction is very time-consuming, multi-camera visual SLAM has high computational complexity and is difficult to run in real time in embedded systems. This study proposes a combined central processing unit and graphics processing unit (CPU–GPU) acceleration strategy for multi-camera visual SLAM that reduces the computational burden, improves computational efficiency, and achieves real-time operation in embedded systems. First, a GPU-based feature extraction acceleration algorithm is introduced for multi-camera visual SLAM; it uses the compute unified device architecture (CUDA) to parallelize the feature extraction algorithm, which is the most time-consuming step. Then, a CPU-based multi-threaded pipelining method that conducts image reading, feature extraction, and tracking concurrently is proposed. It improves the computational efficiency of multi-camera visual SLAM by resolving the load imbalance caused by GPU use and by making better use of the available computing resources. Extensive experimental results demonstrate that the improved multi-camera visual SLAM runs at 15 frames per second in embedded systems and meets the real-time requirement. Moreover, the improved multi-camera visual SLAM is three times faster than the original CPU-based method. Our open-source code can be found online:


Multi-camera system · Visual SLAM · ORB feature extraction · CUDA · Embedded system
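
The multi-threaded pipelining idea described in the abstract (image reading, feature extraction, and tracking running concurrently) can be illustrated with a minimal C++17 sketch. This is not the authors' implementation: the names BlockingQueue, Frame, and run_pipeline are hypothetical, and the GPU feature extraction stage is replaced by a placeholder computation so that the example stays self-contained.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue that hands frames from one stage to the next.
template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Signals that no further items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until an item is available; returns std::nullopt once the
    // queue has been closed and fully drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return std::nullopt;
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Hypothetical stand-in for a multi-camera frame and its extraction result.
struct Frame {
    int id;
    int num_features;
};

// Runs the three stages in separate threads and returns the frame ids in the
// order in which the tracking stage received them.
std::vector<int> run_pipeline(int num_frames) {
    assert(num_frames >= 0);
    BlockingQueue<Frame> raw;        // reader -> extractor
    BlockingQueue<Frame> extracted;  // extractor -> tracker
    std::vector<int> tracked;

    std::thread reader([&] {
        for (int i = 0; i < num_frames; ++i) raw.push(Frame{i, 0});
        raw.close();
    });
    std::thread extractor([&] {
        while (auto frame = raw.pop()) {
            // Placeholder: a real system would launch GPU feature extraction here.
            frame->num_features = 100 + frame->id;
            extracted.push(*frame);
        }
        extracted.close();
    });
    std::thread tracker([&] {
        while (auto frame = extracted.pop()) tracked.push_back(frame->id);
    });

    reader.join();
    extractor.join();
    tracker.join();
    return tracked;
}
```

Calling run_pipeline(5) from a program's main function processes five placeholder frames. Because each queue has a single producer and a single consumer, frames reach the tracking stage in the order they were read, while the three stages overlap in time instead of running strictly one after another.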



This work was supported by the Science and Technology Service Network Initiative (KFJSTS-QYZD-097), the Key Research and Development Program of Anhui Province of China (201904a05020060), the Special Foundation of the President of the Hefei Institutes of Physical Science (YZJJ2019QN22), and the Key Research Program of the Hefei Institutes of Physical Science during the 13th Five-Year Plan Period (Y97Z031892).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Jincheng Li (1, 2)
  • Guoqing Deng (1)
  • Wen Zhang (1), email author
  • Chaofan Zhang (1, 2)
  • Fan Wang (1, 2)
  • Yong Liu (1)

  1. Opto-Electronics Applied Technology Research Centre, Institute of Applied Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, China
  2. University of Science and Technology of China, Hefei, China
