
SDF-2-SDF Registration for Real-Time 3D Reconstruction from RGB-D Data

Abstract

We tackle the task of dense 3D reconstruction from RGB-D data. Contrary to the majority of existing methods, we focus not only on trajectory estimation accuracy, but also on reconstruction precision. The key technique is SDF-2-SDF registration, which is a correspondence-free, symmetric, dense energy minimization method, performed via the direct voxel-wise difference between a pair of signed distance fields. It has a wider convergence basin than traditional point cloud registration and cloud-to-volume alignment techniques. Furthermore, its formulation allows for straightforward incorporation of photometric and additional geometric constraints. We employ SDF-2-SDF registration in two applications. First, we perform small-to-medium scale object reconstruction entirely on the CPU. To this end, the camera is tracked frame-to-frame in real time. Then, the initial pose estimates are refined globally in a lightweight optimization framework, which does not involve a pose graph. We combine these procedures into our second, fully real-time application for larger-scale object reconstruction and SLAM. It is implemented as a hybrid system, whereby tracking is done on the GPU, while refinement runs concurrently over batches on the CPU. To bound memory and runtime footprints, registration is done over a fixed number of limited-extent volumes, anchored at geometry-rich locations. Extensive qualitative and quantitative evaluation of both trajectory accuracy and model fidelity on several public RGB-D datasets, acquired with sensors of varying quality, demonstrates higher precision than related techniques.
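The core idea above, a direct voxel-wise difference between two signed distance fields, can be illustrated with a minimal sketch. This is not the paper's implementation: the truncation band `delta` and the plain sum-of-squares objective are simplifying assumptions, and the paper's additional per-voxel weighting and photometric terms are omitted.

```python
import numpy as np

def sdf_energy(phi_ref, phi_cur, delta=0.002):
    """Simplified SDF-2-SDF energy: half the sum of squared voxel-wise
    differences between two truncated signed distance fields sampled on
    the same voxel grid. `delta` is a hypothetical truncation distance."""
    # Clamp both fields to the narrow band around the surface and
    # normalize, so voxels far from either surface do not dominate.
    a = np.clip(phi_ref, -delta, delta) / delta
    b = np.clip(phi_cur, -delta, delta) / delta
    return 0.5 * np.sum((a - b) ** 2)

# Toy 8x8x8 grids: identical fields yield zero energy; any misalignment
# of the zero crossings raises the objective, which tracking minimizes
# over the relative camera pose.
grid = np.linspace(-0.01, 0.01, 8)
phi = np.broadcast_to(grid, (8, 8, 8)).copy()
print(sdf_energy(phi, phi))  # -> 0.0
```

Because the energy is defined on the volumes themselves rather than on point correspondences, it is symmetric in the two fields, which is what gives the method its correspondence-free character.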





Author information

Corresponding author

Correspondence to Miroslava Slavcheva.

Additional information

Communicated by Michael S. Brown.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (avi 23636 KB)


About this article

Cite this article

Slavcheva, M., Kehl, W., Navab, N. et al. SDF-2-SDF Registration for Real-Time 3D Reconstruction from RGB-D Data. Int J Comput Vis 126, 615–636 (2018). https://doi.org/10.1007/s11263-017-1057-z

Keywords

  • Signed distance field
  • Registration
  • 3D reconstruction
  • Camera tracking
  • Global optimization
  • RGB-D sensors