Abstract
A 3D imaging and mapping system that can handle both multiple viewers and dynamic objects is attractive for many applications. We propose a freeform structured light system that does not rigidly constrain the camera(s) to the projector. By introducing an optimized phase-coded aperture in the projector, we transform the projected pattern so that depth is robustly encoded in its defocus; this allows a camera to estimate depth, in projector coordinates, using only local information. Additionally, we project a Kronecker-multiplexed pattern that provides global context to establish correspondence between camera and projector pixels. Together, the aperture coding and the projected pattern give the projector a unique 3D label for every location in the scene. The projected pattern can be observed, in part or in full, by any camera, to reconstruct both the 3D map of the scene and the camera pose in projector coordinates. The system is optimized using a fully differentiable rendering model and a CNN-based reconstruction. We build a prototype and demonstrate high-quality 3D reconstruction with an unconstrained camera, for both dynamic scenes and multi-camera systems.
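To make the two ideas in the abstract concrete, the following is a minimal, hypothetical sketch (not the paper's actual pipeline): a Kronecker-structured pattern is built as the Kronecker product of a coarse global code and a fine local tile, so every window carries both a global address and local texture; and a toy forward model blurs the pattern by an amount that grows with defocus. The paper's phase-coded aperture shapes this blur to encode depth; here a simple isotropic Gaussian stands in as a placeholder PSF, and all sizes and gains are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# Coarse "global" code: each 8x8 entry acts as an address bit.
global_code = rng.integers(0, 2, size=(8, 8)).astype(float)
# Fine "local" tile: high-frequency texture for local depth cues.
local_tile = rng.integers(0, 2, size=(16, 16)).astype(float)

# Kronecker product: every 16x16 block is the local tile scaled by one
# global bit, giving a 128x128 pattern with both global and local structure.
pattern = np.kron(global_code, local_tile)

def render_defocused(pattern, depth, focal_depth=1.0, blur_gain=4.0):
    """Toy projector forward model: blur grows with distance from focus.

    A plain Gaussian PSF is a stand-in; the paper instead optimizes a
    phase-coded aperture so the PSF itself encodes signed depth.
    """
    sigma = blur_gain * abs(depth - focal_depth)
    return gaussian_filter(pattern, sigma=sigma)

in_focus = render_defocused(pattern, depth=1.0)   # at the focal plane: no blur
defocused = render_defocused(pattern, depth=1.5)  # away from it: blurred
```

Under this toy model, a reconstruction network would invert the blur level (and, with a coded aperture, the blur shape) per pixel to recover depth, while the global code disambiguates which projector region each camera window sees.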
Acknowledgement
This work was supported in part by NSF grants IIS1652633 and CCF1652569, DARPA NESD program N66001-17-C-4012, and JSPS KAKENHI grants JP20H00611 and JP16KK0151.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Wu, Y. et al. (2020). FreeCam3D: Snapshot Structured Light 3D with Freely-Moving Cameras. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12372. Springer, Cham. https://doi.org/10.1007/978-3-030-58583-9_19
DOI: https://doi.org/10.1007/978-3-030-58583-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58582-2
Online ISBN: 978-3-030-58583-9
eBook Packages: Computer Science, Computer Science (R0)