International Journal of Computer Vision, Volume 99, Issue 3, pp 319–337

On Learning Conditional Random Fields for Stereo

Exploring Model Structures and Approximate Inference
  • Christopher J. Pal
  • Jerod J. Weinman
  • Lam C. Tran
  • Daniel Scharstein

Abstract

Until recently, the lack of ground truth data has hindered the application of discriminative structured prediction techniques to the stereo problem. In this paper, we use ground truth data sets that we have recently constructed to explore different model structures and parameter learning techniques. To estimate parameters in Markov random fields (MRFs) via maximum likelihood, one usually needs to perform approximate probabilistic inference. Conditional random fields (CRFs) are discriminative versions of traditional MRFs. We explore a number of novel CRF model structures, including a CRF for stereo matching with an explicit occlusion model. CRFs require expensive inference steps for each iteration of optimization, and inference is particularly slow when there are many discrete states. We explore belief propagation, variational message passing, and graph cuts as inference methods during learning, and compare them with learning via pseudolikelihood. To accelerate approximate inference, we have developed a new method called sparse variational message passing, which can reduce inference time by an order of magnitude with negligible loss in quality. Learning with sparse variational message passing improves upon previous approaches that use graph cuts and allows efficient learning over large data sets when energy functions violate the constraints imposed by graph cuts.
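
The key speed-up described above, sparse variational message passing, truncates each pixel's distribution over disparities to the smallest set of high-probability states and restricts subsequent message computations to those surviving states. The Python sketch below is a minimal illustration of that pruning idea under simplifying assumptions (the function names, the Potts smoothness cost, and the sum-product style update driven by the current belief are all choices made for the example); it is not the authors' implementation.

```python
# Minimal sketch of the state-pruning idea behind sparse message passing:
# keep only the smallest set of disparity states holding at least (1 - eps)
# of the probability mass, then compute messages over that set alone.
import numpy as np

def prune_states(belief, eps=1e-3):
    """Return indices of the smallest state set whose mass is >= 1 - eps."""
    order = np.argsort(belief)[::-1]           # states sorted by probability
    cumulative = np.cumsum(belief[order])
    k = np.searchsorted(cumulative, 1.0 - eps) + 1
    return order[:k]

def sparse_message(belief_p, smoothness, eps=1e-3):
    """Message to a neighbor, summing only over the sender's retained states.

    belief_p   : (D,) normalized belief over disparities at the sending pixel
    smoothness : (D, D) pairwise cost between sender and receiver disparities
    """
    keep = prune_states(belief_p, eps)
    # exp(-cost), restricted to surviving sender states and weighted by belief.
    msg = np.exp(-smoothness[keep, :]).T @ belief_p[keep]
    return msg / msg.sum()

if __name__ == "__main__":
    D = 64                                     # number of disparity states
    rng = np.random.default_rng(0)
    belief = rng.dirichlet(np.full(D, 0.1))    # a peaked belief over disparities
    potts = 1.0 - np.eye(D)                    # simple Potts smoothness cost (assumption)
    print("states kept:", len(prune_states(belief)), "of", D)
    print("message mass:", float(sparse_message(belief, potts).sum()))
```

With a peaked belief, only a handful of the 64 states survive pruning, which is where an order-of-magnitude reduction in inference time can come from.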

Keywords

Stereo · Learning · Structured prediction · Approximate inference

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Christopher J. Pal (1)
  • Jerod J. Weinman (2)
  • Lam C. Tran (3)
  • Daniel Scharstein (4)
  1. École Polytechnique de Montréal, Montréal, Canada
  2. Dept. of Computer Science, Grinnell College, Grinnell, USA
  3. Dept. of Electrical and Computer Engineering, University of California San Diego, San Diego, USA
  4. Middlebury College, Middlebury, USA