Correctness Prediction, Accuracy Improvement and Generalization of Stereo Matching Using Supervised Learning

International Journal of Computer Vision

Abstract

Machine learning has been instrumental in most areas of computer vision, but has not been applied to the problem of stereo matching with similar frequency or success. In this paper, we present a supervised learning approach that defines a set of features capturing various forms of information about each pixel, and then uses them to predict the correctness of stereo matches with a random forest. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This is an important distinction from the current literature, which has mainly focused on sparsification, that is, removing potentially erroneous disparities to generate quasi-dense disparity maps. Finally, we demonstrate domain generalization of our method by applying classifiers to datasets different from those they were trained on, with minimal loss of accuracy.
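
As a rough illustration of what ranking pixels by confidence makes possible, the sketch below builds a sparsification curve: pixels are added from most to least confident, and the error rate of the kept set is recorded at each density. This is not the authors' implementation. The function names, the toy confidence values, and the choice of the area under the curve as a summary score are all assumptions made for the example; in practice the confidence values would come from a trained classifier such as the random forest described above.

```python
from typing import List, Sequence, Tuple


def sparsification_curve(
    confidence: Sequence[float], correct: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (density, error_rate) pairs, adding pixels from most to least confident."""
    if len(confidence) != len(correct):
        raise ValueError("confidence and correct must have the same length")
    # Rank pixel indices by descending confidence (sorted() is stable for ties).
    order = sorted(range(len(confidence)), key=lambda i: -confidence[i])
    curve: List[Tuple[float, float]] = []
    errors = 0
    for kept, idx in enumerate(order, start=1):
        if not correct[idx]:
            errors += 1
        # Density is the fraction of pixels kept; error rate is measured on the kept set.
        curve.append((kept / len(order), errors / kept))
    return curve


def area_under_curve(curve: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under the error-rate-vs-density curve (lower is better)."""
    area = 0.0
    for (d0, e0), (d1, e1) in zip(curve, curve[1:]):
        area += (d1 - d0) * (e0 + e1) / 2.0
    return area


# Hypothetical toy data: per-pixel confidence and whether the disparity is correct.
conf = [0.9, 0.2, 0.8, 0.4, 0.95, 0.1]
ok = [True, False, True, False, True, True]

curve = sparsification_curve(conf, ok)
for density, error_rate in curve:
    print(f"density={density:.2f}  error_rate={error_rate:.2f}")
print(f"area under curve: {area_under_curve(curve):.4f}")
```

A good confidence measure keeps the error rate near zero at low densities and lets it rise only as the least reliable pixels are added, so a smaller area indicates a better ranking.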

References

  • Alahari, K., Russell, C., & Torr, P. (2010). Efficient piecewise learning for conditional random fields. In: CVPR, pp. 895–901

  • Birchfield, S., & Tomasi, C. (1998). A pixel dissimilarity measure that is insensitive to image sampling. PAMI, 20(4), 401–406.

  • Bobick, A., & Intille, S. (1999). Large occlusion stereo. IJCV, 33(3), 1–20.

  • Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. PAMI, 23(11), 1222–1239.

  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

  • Brown, M., Burschka, D., & Hager, G. (2003). Advances in computational stereo. PAMI, 25(8), 993–1008.

  • Criminisi, A., & Shotton, J. (2013). Decision forests for computer vision and medical image analysis. New York: Springer.

  • Cruz, J., Pajares, G., Aranda, J., & Vindel, J. (1995). Stereo matching technique based on the perceptron criterion function. Pattern Recognition Letters, 16(9), 933–944.

  • Geiger, A., Lenz, P., Stiller, C., & Urtasun, R. (2013). Vision meets robotics: The KITTI dataset. International Journal of Robotics Research (IJRR).

  • Haeusler, R., & Klette, R. (2012). Analysis of KITTI data for stereo analysis with stereo confidence measures. In: ECCV Workshops, pp. II: 158–167

  • Haeusler, R., Nair, R., & Kondermann, D. (2013). Ensemble learning for confidence measures in stereo vision. In: CVPR

  • Hirschmüller, H. (2008). Stereo processing by semiglobal matching and mutual information. PAMI, 30(2), 328–341.

  • Hu, X., & Mordohai, P. (2012). A quantitative evaluation of confidence measures for stereo vision. PAMI, 34(11), 2121–2133.

  • Kim, J.C., Lee, K.M., Choi, B.T., & Lee, S.U. (2005). A dense stereo matching using two-pass dynamic programming with generalized ground control points. In: CVPR, pp. 1075–1082

  • Komodakis, N., Tziritas, G., & Paragios, N. (2007). Fast, approximately optimal solutions for single and dynamic MRFs. In: CVPR

  • Kong, D., & Tao, H. (2004). A method for learning matching errors for stereo computation. In: BMVC

  • Kong, D., & Tao, H. (2006). Stereo matching via learning multiple experts behaviors. In: BMVC

  • Lew, M., Huang, T., & Wong, K. (1994). Learning and feature selection in stereo matching. PAMI, 16(9), 869–881.

  • Li, Y., & Huttenlocher, D. (2008). Learning for stereo vision using the structured support vector machine. In: CVPR

  • Mac Aodha, O., Humayun, A., Pollefeys, M., & Brostow, G. J. (2012). Learning a confidence measure for optical flow. PAMI, 35(5), 1107–1120.

  • Merrell, P., Akbarzadeh, A., Wang, L., Mordohai, P., Frahm, J.M., Yang, R., Nistér, D., & Pollefeys, M. (2007). Real-time visibility-based fusion of depth maps. In: ICCV

  • Motten, A., Claesen, L., & Pan, Y. (2012). Trinocular disparity processor using a hierarchic classification structure. In: IEEE/IFIP International Conference on VLSI and System-on-Chip

  • Pal, C., Weinman, J., Tran, L., & Scharstein, D. (2012). On learning conditional random fields for stereo: Exploring model structures and approximate inference. IJCV, 99(3), 319–337.

  • Park, M.G., & Yoon, K.J. (2015). Leveraging stereo matching with learning-based confidence measures. In: CVPR, pp. 101–109

  • Pfeiffer, D., Gehrig, S., & Schneider, N. (2013). Exploiting the power of stereo confidences. In: CVPR, pp. 297–304

  • Reynolds, M., Dobos, J., Peel, L., Weyrich, T., & Brostow, G. (2011). Capturing time-of-flight data with confidence. In: CVPR, pp. 945–952

  • Sabater, N., Almansa, A., & Morel, J. (2012). Meaningful matches in stereovision. PAMI, 34(5), 930–942.

  • Scharstein, D., & Pal, C. (2007). Learning conditional random fields for stereo. In: CVPR

  • Scharstein, D., & Szeliski, R. (2002). A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. IJCV, 47(1–3), 7–42.

  • Spyropoulos, A., Komodakis, N., & Mordohai, P. (2014). Learning to detect ground control points for improving the accuracy of stereo matching. In: CVPR, pp. 1621–1628

  • Sun, X., Mei, X., Jiao, S., Zhou, M., & Wang, H. (2011). Stereo matching with reliable disparity propagation. In: 3DIMPVT, pp. 132–139

  • Trinh, H., & McAllester, D. (2009). Unsupervised learning of stereo vision with monocular depth cues. In: BMVC

  • Wang, L., & Yang, R. (2011). Global stereo matching leveraged by sparse ground control points. In: CVPR, pp. 3033–3040

  • Yamaguchi, K., Hazan, T., McAllester, D., & Urtasun, R. (2012). Continuous Markov random fields for robust stereo estimation. In: ECCV, pp. V: 45–58

  • Yoon, K., & Kweon, I. (2006). Adaptive support-weight approach for correspondence search. PAMI, 28(4), 650–656.

  • Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. In: CVPR

  • Zbontar, J., & LeCun, Y. (2015). Computing the stereo matching cost with a convolutional neural network. In: CVPR

  • Zhang, K., Lu, J., & Lafruit, G. (2009). Cross-based local stereo matching using orthogonal integral images. IEEE Transactions on Circuits and Systems for Video Technology, 19(7), 1073–1079.

  • Zhang, L., & Seitz, S. (2007). Estimating optimal parameters for MRF stereo from a single image pair. PAMI, 29(2), 331–342.

Acknowledgments

The authors are grateful to Haeusler et al. (2013), Park and Yoon (2015) and Wang and Yang (2011) for sharing data and providing guidance on how to implement their algorithms. This research has been supported in part by the National Science Foundation Awards #1217797 and #1527294.

Author information

Corresponding author

Correspondence to Aristotle Spyropoulos.

Additional information

Communicated by O. Veksler.

About this article

Cite this article

Spyropoulos, A., Mordohai, P. Correctness Prediction, Accuracy Improvement and Generalization of Stereo Matching Using Supervised Learning. Int J Comput Vis 118, 300–318 (2016). https://doi.org/10.1007/s11263-015-0877-y

  • DOI: https://doi.org/10.1007/s11263-015-0877-y
