Abstract
In machine vision sensing systems, high-quality real-time 3D reconstruction of large-scale scenes is an important goal. Recent online approaches perform well, but as the reconstruction is scaled up, pose estimation drifts and the error accumulates. Fully correcting this error usually requires a large amount of offline processing, which reduces reconstruction performance. To optimize the traditional volumetric fusion method and improve on the older frame-to-frame pose estimation strategy, this paper presents a real-time CPU-to-GPU (Graphics Processing Unit) reconstruction system. Based on a robust camera pose estimation strategy, the algorithm fuses all RGB-D input into an efficient hierarchical optimization framework and optimizes each frame against the global camera poses, which removes the heavy dependence on temporal tracking and allows continuous tracking against globally optimized frames. The system estimates globally optimized poses (bundling) in real time, supports robust tracking recovery (relocalization), and re-estimates large-scale 3D scenes to ensure global consistency. It uses a set of sparse feature correspondences together with geometric and ray-matching terms within a single parallel optimization system. Experimental results show an average reconstruction time of 415 ms per frame, with the ICP pose estimated 20 times in 100.0 ms. For large-scale 3D reconstruction, the system performs well in terms of online reconstruction area while maintaining reconstruction accuracy.
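The drift problem that motivates the work can be illustrated with a toy example. The sketch below is not the system described in the paper; it is a minimal 2D illustration, assuming a camera that moves in a straight line and a frame-to-frame estimator whose only fault is a small constant rotation bias. The step size, bias, frame count, and all function names are hypothetical values chosen for illustration. Chaining the biased relative poses shows how a tiny per-frame error grows into a large position error over a long trajectory.

```python
import math


def compose(pose, delta):
    """Apply a relative motion (given in the pose's local frame) to a 2D pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    return (
        x + math.cos(theta) * dx - math.sin(theta) * dy,
        y + math.sin(theta) * dx + math.cos(theta) * dy,
        theta + dtheta,
    )


def integrate(deltas):
    """Chain relative motions frame to frame, starting at the origin."""
    pose = (0.0, 0.0, 0.0)
    trajectory = [pose]
    for delta in deltas:
        pose = compose(pose, delta)
        trajectory.append(pose)
    return trajectory


def position_error(a, b):
    """Euclidean distance between the positions of two poses."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical values, chosen only for illustration.
STEP = 0.05                # forward motion per frame, in metres
BIAS = math.radians(0.1)   # rotation error introduced at every frame
N_FRAMES = 400

true_deltas = [(STEP, 0.0, 0.0)] * N_FRAMES
biased_deltas = [(STEP, 0.0, BIAS)] * N_FRAMES

truth = integrate(true_deltas)
estimate = integrate(biased_deltas)
errors = [position_error(t, e) for t, e in zip(truth, estimate)]

for frame in (0, 100, 200, 400):
    print(f"frame {frame:3d}: position error = {errors[frame]:.3f} m")
```

Because every pose is built on top of the previous estimate, the error is never corrected and keeps growing with trajectory length. Optimizing each frame against globally consistent poses, as the abstract describes, is aimed at exactly this failure mode.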
Acknowledgements
This work was funded by the National High-tech R&D Program (863 Program) (No. 2014AA7031010B) and the Science and Technology Project of the Thirteenth Five-Year Plan (JJZ[2016]345).
Cite this article
Lin, Jh., Wang, L. & Wang, Yj. A Hierarchical Optimization Algorithm Based on GPU for Real-Time 3D Reconstruction. 3D Res 8, 16 (2017). https://doi.org/10.1007/s13319-017-0127-x