Refining Geometry from Depth Sensors using IR Shading Images

Abstract

We propose a method to refine the geometry of 3D meshes from a consumer-level depth camera, e.g., the Kinect, by exploiting shading cues captured by an infrared (IR) camera. A major benefit of using an IR camera instead of an RGB camera is that the captured IR images are narrow-band images that filter out most undesired ambient light, which makes our system robust against natural indoor illumination. Moreover, many natural objects with colorful textures in the visible spectrum appear to have a uniform albedo in the IR spectrum. Based on our analyses of the IR projector light of the Kinect, we define a near-light-source IR shading model that describes the captured intensity as a function of surface normals, albedo, lighting direction, and the distance between the light source and surface points. To resolve the ambiguity in our model between normals and distances, we utilize an initial 3D mesh from KinectFusion together with multi-view information to reliably estimate surface details that were not captured and reconstructed by KinectFusion. Our approach operates directly on the mesh model for geometry refinement. We ran experiments on geometries captured by both the Kinect I and the Kinect II, as depth acquisition in the Kinect I is based on a structured-light technique while that of the Kinect II is based on time-of-flight technology. The effectiveness of our approach is demonstrated through several challenging real-world examples. We have also performed a user study to evaluate the quality of the mesh models before and after our refinements.
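The near-light-source shading model described above can be illustrated with a minimal sketch: under a Lambertian assumption, the observed IR intensity at a surface point depends on the albedo, the foreshortening between the surface normal and the direction to the light, and an inverse-square falloff with the distance to the light source. The function and parameter names below are illustrative, not the paper's notation.

```python
import numpy as np

def ir_intensity(point, normal, light_pos, albedo=1.0, light_power=1.0):
    """Predicted IR intensity at a surface point under a near point light.

    A simple Lambertian near-light model: intensity is proportional to
    albedo, the cosine of the angle between the surface normal and the
    direction to the light, and the inverse square of the distance.
    """
    to_light = light_pos - point
    dist = np.linalg.norm(to_light)
    light_dir = to_light / dist
    # Lambertian foreshortening, clamped to zero for back-facing surfaces.
    cos_theta = max(np.dot(normal / np.linalg.norm(normal), light_dir), 0.0)
    # Inverse-square falloff from the near light source.
    return light_power * albedo * cos_theta / dist**2

# A point 1 m from the light, facing it directly:
i1 = ir_intensity(np.array([0.0, 0.0, 1.0]),
                  np.array([0.0, 0.0, -1.0]),
                  np.array([0.0, 0.0, 0.0]))
# The same surface moved to 2 m receives a quarter of the intensity.
i2 = ir_intensity(np.array([0.0, 0.0, 2.0]),
                  np.array([0.0, 0.0, -1.0]),
                  np.array([0.0, 0.0, 0.0]))
```

The coupling visible here between `normal` (through the cosine term) and `dist` (through the falloff term) is the normal/distance ambiguity the paper resolves with the initial KinectFusion mesh and multi-view information.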


Notes

  1. http://www.microsoft.com/en-us/kinectforwindows/.

  2. Strictly speaking, it captures a range between 800 nm and 2500 nm, which belongs to the near-infrared band. For simplicity, we abbreviate this band as the IR band in this paper.

  3. The IR image is radiometrically calibrated.

  4. SparseLM: Sparse Levenberg-Marquardt nonlinear least squares. http://users.ics.forth.gr/~lourakis/sparseLM/.

  5. The Kinect IR projector is hard-wired and cannot be modified.


Author information

Corresponding author

Correspondence to In So Kweon.

Additional information

Communicated by Yasuhiro Mukaigawa.

Cite this article

Choe, G., Park, J., Tai, YW. et al. Refining Geometry from Depth Sensors using IR Shading Images. Int J Comput Vis 122, 1–16 (2017). https://doi.org/10.1007/s11263-016-0937-y

Keywords

  • RGB-D sensor
  • Kinect
  • Infrared
  • IR
  • Geometry refinement
  • Shading image
  • Shape from shading