Abstract
We propose a method to refine the geometry of 3D meshes obtained from a consumer-level depth camera, e.g. Kinect, by exploiting shading cues captured by an infrared (IR) camera. A major benefit of using an IR camera instead of an RGB camera is that the captured IR images are narrow-band images that filter out most undesired ambient light, which makes our system robust against natural indoor illumination. Moreover, many natural objects with colorful textures in the visible spectrum appear to have a uniform albedo in the IR spectrum. Based on our analysis of the IR projector light of the Kinect, we define a near-light-source IR shading model that describes the captured intensity as a function of surface normals, albedo, lighting direction, and the distance between the light source and surface points. To resolve the ambiguity in our model between normals and distances, we use an initial 3D mesh from KinectFusion together with multi-view information to reliably estimate surface details that KinectFusion did not capture and reconstruct. Our approach operates directly on the mesh model for geometry refinement. We ran experiments on geometries captured by both the Kinect I and the Kinect II, since depth acquisition in the Kinect I is based on a structured-light technique whereas that of the Kinect II is based on time-of-flight technology. The effectiveness of our approach is demonstrated on several challenging real-world examples. We also performed a user study to evaluate the quality of the mesh models before and after our refinement.
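The abstract does not state the exact form of the shading model. As an illustration only, the sketch below assumes a Lambertian surface lit by a point source with inverse-square falloff, so that intensity is albedo times the cosine between normal and lighting direction, divided by squared distance. The function name `near_light_ir_intensity` and this specific formula are assumptions made here, not the paper's definition. Under that assumption, the example shows why normals and distances are ambiguous from intensity alone: a distant frontal patch and a nearby tilted patch can produce the same value.

```python
import math

def near_light_ir_intensity(point, normal, light_pos, albedo):
    """Hypothetical near-light shading: albedo * cos(theta) / distance^2.

    Assumes a Lambertian surface and a point light source; this is an
    illustrative stand-in, not the model defined in the paper.
    """
    # Vector from the surface point to the light source.
    to_light = [l - p for l, p in zip(light_pos, point)]
    dist = math.sqrt(sum(c * c for c in to_light))
    light_dir = [c / dist for c in to_light]
    # Normalize the surface normal so the dot product is a cosine.
    n_len = math.sqrt(sum(c * c for c in normal))
    unit_n = [c / n_len for c in normal]
    # Clamp at zero: patches facing away from the light receive nothing.
    cos_theta = max(0.0, sum(n * l for n, l in zip(unit_n, light_dir)))
    return albedo * cos_theta / (dist * dist)

light = (0.0, 0.0, 0.0)

# Patch 2 units from the light, facing it head-on.
far_frontal = near_light_ir_intensity((0.0, 0.0, 2.0), (0.0, 0.0, -1.0), light, 1.0)

# Patch 1 unit from the light, tilted so that cos(theta) = 0.25.
tilt = math.acos(0.25)
near_tilted = near_light_ir_intensity(
    (0.0, 0.0, 1.0), (math.sin(tilt), 0.0, -math.cos(tilt)), light, 1.0
)

# Both configurations yield (numerically) the same intensity.
print(far_frontal, near_tilted)
```

Because a single intensity cannot separate the two cases, some prior on distance is needed. In the method described above, that role is played by the initial mesh and the multi-view information.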
Notes
Strictly speaking, it captures a range between 800 nm and 2500 nm, which belongs to the near-infrared band. For simplicity, we abbreviate the band as the IR band in this paper.
The IR image is radiometrically calibrated.
SparseLM: Sparse Levenberg-Marquardt nonlinear least squares http://users.ics.forth.gr/~lourakis/sparseLM/.
The Kinect IR projector is hard-wired and cannot be modified.
Communicated by Yasuhiro Mukaigawa.
About this article
Cite this article
Choe, G., Park, J., Tai, YW. et al. Refining Geometry from Depth Sensors using IR Shading Images. Int J Comput Vis 122, 1–16 (2017). https://doi.org/10.1007/s11263-016-0937-y
DOI: https://doi.org/10.1007/s11263-016-0937-y