3D Research, 8:16

A Hierarchical Optimization Algorithm Based on GPU for Real-Time 3D Reconstruction

3DR Express


In machine vision sensing systems, high-quality real-time 3D reconstruction of large-scale scenes is important. Recent online approaches perform well, but as the reconstruction is scaled up, the pose estimate drifts and error accumulates. Fully correcting that error usually requires a large amount of off-line processing, which reduces reconstruction performance. To optimize the traditional volume fusion method and improve on the older frame-to-frame pose estimation strategy, this paper presents a real-time CPU-to-GPU (Graphics Processing Unit) reconstruction system. Based on a robust camera pose estimation strategy, the algorithm fuses all RGB-D input into an effective hierarchical optimization framework and optimizes each frame against the global camera poses, which removes the heavy dependence on temporal tracking and allows tracking to continue against globally optimized frames. The system estimates globally optimized poses (bundle adjustment) in real time, supports robust tracking recovery (relocalization), and re-estimates large-scale 3D scenes to ensure global consistency. It uses a set of sparse feature correspondences together with geometric and ray matching functions within a parallel optimization system. The experimental results show an average reconstruction time of 415 ms per frame, with the ICP pose estimated 20 times in 100.0 ms. For large-scale 3D scenes, the system performs well in online reconstruction while maintaining reconstruction accuracy.
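The abstract refers to volume fusion and to re-estimating the scene after poses are refined, without giving the update rule. As a rough illustration only, the following sketch shows the classic weighted running-average update for a single voxel of a truncated signed distance volume, plus its inverse, which is one way a frame could be removed and re-fused under a corrected pose. The names `Voxel`, `truncate`, `integrate`, and `deintegrate`, the unit weights, and the truncation distance are all assumptions made for this example; they are not taken from the paper, and a real system would run the update over many voxels in parallel on the GPU.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    """One cell of a truncated signed distance volume (illustrative only)."""
    tsdf: float = 0.0    # running weighted mean of normalized signed distances
    weight: float = 0.0  # accumulated observation weight


def truncate(sdf: float, trunc: float) -> float:
    """Clamp a signed distance to [-trunc, trunc], then normalize to [-1, 1]."""
    return max(-trunc, min(trunc, sdf)) / trunc


def integrate(voxel: Voxel, sdf: float, trunc: float, w: float = 1.0) -> None:
    """Fold one depth observation into the voxel as a weighted running mean."""
    d = truncate(sdf, trunc)
    total = voxel.weight + w
    voxel.tsdf = (voxel.tsdf * voxel.weight + d * w) / total
    voxel.weight = total


def deintegrate(voxel: Voxel, sdf: float, trunc: float, w: float = 1.0) -> None:
    """Remove a previously integrated observation (inverse of integrate)."""
    d = truncate(sdf, trunc)
    remaining = voxel.weight - w
    if remaining <= 1e-9:
        # Nothing left in this voxel: reset it to the empty state.
        voxel.tsdf, voxel.weight = 0.0, 0.0
        return
    voxel.tsdf = (voxel.tsdf * voxel.weight - d * w) / remaining
    voxel.weight = remaining


if __name__ == "__main__":
    v = Voxel()
    integrate(v, sdf=0.02, trunc=0.05)    # first frame sees the surface 2 cm away
    integrate(v, sdf=0.04, trunc=0.05)    # second frame sees it 4 cm away
    print(v)
    deintegrate(v, sdf=0.02, trunc=0.05)  # undo the first frame before re-fusing it
    print(v)
```

Because the update is a weighted mean, removing an observation with the same distance and weight that were used to add it restores the voxel to the state it would have had without that frame, up to floating-point rounding. That property is what makes re-fusing a frame under a refined pose cheap compared with rebuilding the whole volume.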


Keywords: Machine vision, 3D reconstruction, Online volume fusion, GPU, Pose estimation



This work is funded by the National High-tech R&D Program (863 Program) (No. 2014AA7031010B) and the Science and Technology Project of the Thirteenth Five-Year Plan (JJZ[2016]345).



Copyright information

© 3D Research Center, Kwangwoon University and Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Machinery and Electronics Engineering, Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China
  2. Machinery and Electronics Engineering, Chinese Academy of Sciences University, Changchun, China
  3. Computer Application Technology, Changchun University of Technology, Changchun, China
