Confocal Stereo

  • Samuel W. Hasinoff
  • Kiriakos N. Kutulakos

Abstract

We present confocal stereo, a new method for computing 3D shape by controlling the focus and aperture of a lens. The method is specifically designed for reconstructing scenes with high geometric complexity or fine-scale texture. To achieve this, we introduce the confocal constancy property, which states that as the lens aperture varies, the pixel intensity of a visible in-focus scene point will vary in a scene-independent way that can be predicted by prior radiometric lens calibration. The only requirement is that incoming radiance within the cone subtended by the largest aperture is nearly constant. First, we develop a detailed lens model that factors out the distortions in high-resolution SLR cameras (12 MP or more) with large-aperture lenses (e.g., f/1.2). This allows us to assemble an A×F aperture-focus image (AFI) for each pixel, which collects the undistorted measurements over all A apertures and F focus settings. In the AFI representation, confocal constancy reduces to color comparisons within regions of the AFI and leads to focus metrics that can be evaluated separately for each pixel. We propose two such metrics and present initial reconstruction results for complex scenes, as well as for a scene with known ground-truth shape.
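The abstract does not spell out the two focus metrics, but the following minimal sketch (Python/NumPy, with hypothetical array names) illustrates how confocal constancy can be turned into a per-pixel focus metric over an AFI. The idea: after dividing out the calibrated, scene-independent aperture scaling, an in-focus pixel's intensity should be nearly constant across apertures, so variance along the aperture axis is one plausible metric, minimized at the correct focus setting. This is a sketch of the concept, not the authors' implementation, and it uses a single intensity channel rather than color for simplicity.

```python
import numpy as np

def confocal_constancy_depth(afi, aperture_scale):
    """Per-pixel focus estimation from an aperture-focus image (AFI).

    afi            : (A, F, H, W) array of undistorted pixel intensities,
                     one slice per (aperture, focus) combination
    aperture_scale : (A,) scene-independent intensity scaling per aperture,
                     obtained from prior radiometric lens calibration

    Returns an (H, W) array of best focus-setting indices, one per pixel.
    """
    # Divide out the predictable, scene-independent aperture attenuation,
    # so an in-focus pixel has (nearly) constant intensity across apertures.
    calibrated = afi / aperture_scale[:, None, None, None]

    # Confocal constancy metric: variance across the aperture axis.
    # At the in-focus setting this variance should be minimal.
    metric = calibrated.var(axis=0)      # shape (F, H, W)

    # The metric is evaluated independently for each pixel.
    return metric.argmin(axis=0)         # shape (H, W)
```

In a full pipeline, the returned focus-setting index per pixel would be mapped to metric depth via the lens's focus calibration; that mapping is omitted here.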

Keywords

Defocus · Depth from focus · Depth from defocus · 3D reconstruction · Stereo · Camera calibration · Wide-aperture lenses

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

Department of Computer Science, University of Toronto, Toronto, Canada
