Abstract
Traditional depth estimation methods typically exploit either variations in internal parameters such as aperture and focus (as in depth from defocus) or variations in extrinsic parameters such as camera position and orientation (as in stereo). When operating off-the-shelf (OTS) cameras in a general setting, these parameters influence the depth of field (DOF) and the field of view (FOV). A finite DOF forces one to contend with defocus blur, while a larger FOV necessitates camera motion during image acquisition. As a result, for unfettered operation of an OTS camera, it becomes essential to account for both pixel motion and optical defocus blur in the captured images. We propose a depth estimation framework that uses calibrated images captured under general camera motion and lens parameter variations. Our formulation generalizes the constrained settings of stereo and shape from defocus (SFD)/focus (SFF) by handling, in tandem, effects such as focus variation, zoom, parallax and stereo occlusions, all under one roof. One of the challenges in such an unrestrained scenario is removing user-defined foreground occluders from the reference depth map and image (termed inpainting of depth and image). Inpainting is achieved by exploiting the motion-parallax cue to discover, in the other images, the correspondence/color information missing in the reference image. Moreover, since the observations may be differently blurred, it is important to ensure that the degree of defocus in the filled regions of the reference image is coherent with their local neighborhood (defocus inpainting).
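As a rough illustration of the defocus cue the abstract builds on, the standard thin-lens model ties a point's depth to the size of its blur circle: a point at depth d is sharp only when 1/f = 1/d + 1/s. The sketch below is generic, not the paper's formulation; the symbols f (focal length), s (lens-to-sensor distance) and A (aperture diameter) are illustrative assumptions.

```python
def blur_diameter(d, f, s, A):
    """Thin-lens blur-circle diameter for a scene point at depth d,
    given focal length f, lens-to-sensor distance s and aperture
    diameter A (all in metres). The point is in perfect focus when
    1/f = 1/d + 1/s; blur grows as d departs from the focused depth."""
    return A * s * abs(1.0 / f - 1.0 / d - 1.0 / s)

# Focus a 50 mm lens with a 10 mm aperture on a plane at 2 m:
f, A, d0 = 0.05, 0.01, 2.0
s = f * d0 / (d0 - f)  # sensor distance that brings depth d0 into focus

print(blur_diameter(d0, f, s, A))  # essentially zero: the focused plane is sharp
# Points farther from the focused depth are blurred more:
print(blur_diameter(1.0, f, s, A) > blur_diameter(1.5, f, s, A))  # True
```

Inverting this relation (blur size back to depth) is the essence of depth from defocus; combining it with the pixel motion induced by camera movement is what the paper's joint formulation addresses.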
Cite this article
Bhavsar, A.V., Rajagopalan, A.N. Towards Unrestrained Depth Inference with Coherent Occlusion Filling. Int J Comput Vis 97, 167–190 (2012). https://doi.org/10.1007/s11263-011-0476-5