Abstract
Machine learning has been instrumental in most areas of computer vision, but has not been applied to the problem of stereo matching with similar frequency or success. In this paper, we present a supervised learning approach: we define a set of features that capture various forms of information about each pixel, and use them with a random forest to predict the correctness of stereo matches. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This distinguishes our work from the current literature, which has mainly focused on sparsification, that is, on removing potentially erroneous disparities to generate quasi-dense disparity maps. Finally, we demonstrate domain generalization of our method by applying classifiers to datasets different from those on which they were trained, with minimal loss of accuracy.
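The idea of ranking pixels by confidence, and the sparsification that the abstract contrasts with, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `sparsify` and `error_rate`, the one-pixel error tolerance, and all numeric values are assumptions made for illustration, and the confidences are simply given here rather than predicted by a random forest.

```python
def sparsify(disparities, confidences, keep_fraction):
    """Keep the most confident fraction of pixels; mark the rest as None."""
    n = len(disparities)
    n_keep = int(round(keep_fraction * n))
    # Rank pixel indices from most to least confident.
    ranked = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [d if i in kept else None for i, d in enumerate(disparities)]


def error_rate(estimates, ground_truth, tol=1.0):
    """Fraction of kept disparities that differ from ground truth by more than tol."""
    kept = [(d, g) for d, g in zip(estimates, ground_truth) if d is not None]
    if not kept:
        return 0.0
    wrong = sum(1 for d, g in kept if abs(d - g) > tol)
    return wrong / len(kept)


# Toy data for six pixels; every value here is made up for illustration.
disparities = [10, 12, 30, 11, 5, 12]
ground_truth = [10, 12, 12, 11, 11, 13]
confidences = [0.9, 0.8, 0.2, 0.95, 0.1, 0.6]  # hypothetical classifier outputs

dense_error = error_rate(disparities, ground_truth)
sparse = sparsify(disparities, confidences, keep_fraction=0.5)
sparse_error = error_rate(sparse, ground_truth)
```

In this toy case, discarding the less confident half of the pixels removes both erroneous disparities, but it also leaves holes in the map. Sparsification alone therefore yields only a quasi-dense result, whereas the abstract describes feeding the confidence values into an MRF-based stereo algorithm to improve the disparity map itself.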
Additional information
Communicated by O. Veksler.
About this article
Cite this article
Spyropoulos, A., Mordohai, P. Correctness Prediction, Accuracy Improvement and Generalization of Stereo Matching Using Supervised Learning. Int J Comput Vis 118, 300–318 (2016). https://doi.org/10.1007/s11263-015-0877-y
DOI: https://doi.org/10.1007/s11263-015-0877-y