Occlusion-Aware Stereo Matching

International Journal of Computer Vision

Abstract

Stereo vision systems with additional flash/no-flash cues have been demonstrated to be robust to depth discontinuities. The ratio of a flash and no-flash image pair naturally provides additional scene depth information and thus can serve as a strong cue for preserving depth discontinuities. However, the existing solution simply uses the ratio as guidance for matching cost aggregation and is therefore still vulnerable to occlusions. The inevitable misalignment of flash and no-flash images due to camera and/or scene motion also remains unsolved. This paper investigates these two problems. An occlusion detection approach is derived from foreground/background extraction. The matching cost computed in occluded regions, which is useless and even harmful, is discarded so that reliable information from non-occluded regions can be propagated in. Foreground, occlusion, and depth estimation are modeled in a unified framework based on Expectation-Maximization. The proposed solution is evaluated on both indoor and outdoor data sets and shows clear improvement over state-of-the-art methods.
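The two ideas named in the abstract, the flash/no-flash ratio and the discarding of matching cost in occluded regions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `flash_ratio`, `box_sum`, and `aggregate_ignoring_occlusions` are hypothetical, the occlusion mask is assumed to be given, and a plain box window stands in for the ratio-guided, Expectation-Maximization-based aggregation the paper describes.

```python
import numpy as np

def flash_ratio(flash, no_flash, eps=1e-6):
    """Per-pixel flash / no-flash intensity ratio (hypothetical helper).
    eps is an assumed small constant that avoids division by zero."""
    flash = np.asarray(flash, dtype=np.float64)
    no_flash = np.asarray(no_flash, dtype=np.float64)
    return flash / (no_flash + eps)

def box_sum(a, radius):
    """Sum over a (2r+1) x (2r+1) window along the first two axes,
    with zero padding at the image border."""
    a = np.asarray(a, dtype=np.float64)
    h, w = a.shape[:2]
    pad = [(radius, radius), (radius, radius)] + [(0, 0)] * (a.ndim - 2)
    padded = np.pad(a, pad, mode="constant")
    out = np.zeros_like(a)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def aggregate_ignoring_occlusions(cost_volume, occluded, radius=1):
    """Aggregate an (H, W, D) cost volume while discarding the cost at
    occluded pixels. Each output is the mean cost over the non-occluded
    pixels in the window, so occluded pixels are filled from neighbours.
    Windows with no valid pixel yield 0."""
    cost = np.asarray(cost_volume, dtype=np.float64)
    valid = (~np.asarray(occluded, dtype=bool)).astype(np.float64)
    numerator = box_sum(cost * valid[..., None], radius)
    denominator = box_sum(valid, radius)[..., None]
    return numerator / np.maximum(denominator, 1e-12)

# Toy example: a 1 x 3 image with 2 disparity hypotheses.
cost = np.array([[[1.0, 5.0], [100.0, 100.0], [3.0, 7.0]]])
occluded = np.array([[False, True, False]])
aggregated = aggregate_ignoring_occlusions(cost, occluded, radius=1)
ratio = flash_ratio([[2.0, 4.0]], [[1.0, 4.0]])
```

In the toy example the middle pixel is marked occluded, so its unreliable cost of 100 is ignored and its aggregated cost comes only from its two non-occluded neighbours.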

Notes

  1. Most current commercial active sensors are not reliable in outdoor environments, so only indoor environments were tested.

Author information

Corresponding author

Correspondence to Qingxiong Yang.

Additional information

Communicated by Long Quan.

About this article

Cite this article

Xu, J., Yang, Q. & Feng, Z. Occlusion-Aware Stereo Matching. Int J Comput Vis 120, 256–271 (2016). https://doi.org/10.1007/s11263-016-0910-9

  • DOI: https://doi.org/10.1007/s11263-016-0910-9
