Filter-Based Mean-Field Inference for Random Fields with Higher-Order Terms and Product Label-Spaces

Vineet, Vibhav; Warrell, Jonathan; Torr, Philip H. S.

doi:10.1007/s11263-014-0708-6

Published: 19 March 2014

Volume 110, pages 290–307, (2014)
International Journal of Computer Vision

Vibhav Vineet¹,
Jonathan Warrell² &
Philip H. S. Torr³


Abstract

Recently, a number of cross bilateral filtering methods have been proposed for solving multi-label problems in computer vision, such as stereo, optical flow and object class segmentation, which show an order of magnitude improvement in speed over previous methods. These methods have achieved good results despite using models with only unary and/or pairwise terms. However, previous work has shown the value of using models with higher-order terms, e.g. to represent label consistency over large regions, or global co-occurrence relations. We show how these higher-order terms can be formulated such that filter-based inference remains possible. We demonstrate our techniques on joint stereo and object labelling problems, as well as object class segmentation, and show in addition, for joint object-stereo labelling, how our method provides an efficient approach to inference in product label-spaces. We show that we are able to speed up inference in these models by around 10–30 times with respect to competing graph-cut/move-making methods, while maintaining or improving accuracy in all cases. We show results on PascalVOC-10 for object class segmentation, and Leuven for joint object-stereo labelling.


Notes

For exact MPM inference, the solution satisfies \(x^{{{\mathrm{MPM}}}}_i \in {{\mathrm{argmax}}}_l \sum _{\{\mathbf {x}|x_i=l\}}P(\mathbf {x}|I)\).
Although the updates are conceptually parallel in form, the permutohedral lattice convolution is implemented sequentially.
The class of such sparse higher-order potentials is also considered in Rother et al. (2009).
Equation 9 requires evaluation of the joint probability of \(c-1\) variable assignments for each of the \(|\mathcal {P}_c|\) patterns, leading to the complexity \(O(|\mathcal {P}_c||c|)\) for a single evaluation. If \(Q\) is prevented from taking the values \(0\) and \(1\), the joint pattern probabilities \(\prod _{j\in c}Q_j(x_j=p_j)\) can be calculated once for each clique, and the conditional forms \(\prod _{j\in c, j\ne i}Q_j(x_j=p_j)\) needed for parallel updates can then be derived by dividing by \(Q_i(x_i=p_i)\), leading to the overall \(O(\max _c(|\mathcal {P}_c||c|)|\mathcal {C}^{{{\mathrm{pat}}}}|)\) complexity.
In fact we use slightly different co-occurrence potentials with graph-cuts and mean-field, since for graph-cuts we use \(\psi ^{{{\mathrm{cooc}}}}\) while for mean-field we use \(\psi ^{{{\mathrm{cooc-2}}}}\), although we set the costs \(C(\Lambda )\) identically. We view the latter as an approximation of the former, and thus view this as a slight handicap for mean-field inference; however, further experiments would be needed to determine if the different forms of this potential lead to better/worse models.
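
As an illustration of the note on Equation 9 above, the following is a minimal Python sketch (not the authors' implementation) of the division trick it describes: the joint pattern probabilities \(\prod _{j\in c}Q_j(x_j=p_j)\) are computed once per clique, and the leave-one-out products needed for parallel updates are recovered by dividing by \(Q_i(x_i=p_i)\). The function names, the array layout (`Q` as an n-by-L matrix of marginals for the n variables of one clique, `patterns` as a P-by-n matrix of labels) and the clipping constant `eps` are assumptions made for this example only.

```python
import numpy as np


def pattern_products(Q, patterns, eps=1e-6):
    """Joint and leave-one-out pattern probabilities for a single clique.

    Q        : (n, L) array, Q[j, l] = Q_j(x_j = l) for the n clique variables.
    patterns : (P, n) integer array, patterns[p, j] = label p_j of pattern p.
    Returns (joint, excl) with
      joint[p]   = prod_j       Q[j, patterns[p, j]]
      excl[p, i] = prod_{j != i} Q[j, patterns[p, j]]
    """
    # Keep Q strictly inside (0, 1) so that the division below is well defined.
    # (Assumption: a small clip; the rows are not renormalised afterwards.)
    Q = np.clip(Q, eps, 1.0 - eps)
    n = Q.shape[0]
    # q_sel[p, j] = Q_j(x_j = p_j), gathered with broadcasted integer indexing.
    q_sel = Q[np.arange(n)[None, :], patterns]
    joint = q_sel.prod(axis=1)       # one product of |c| terms per pattern
    excl = joint[:, None] / q_sel    # leave-one-out products by division
    return joint, excl


def pattern_products_naive(Q, patterns, eps=1e-6):
    """Reference version: recompute every leave-one-out product explicitly."""
    Q = np.clip(Q, eps, 1.0 - eps)
    P, n = patterns.shape
    excl = np.empty((P, n))
    for p in range(P):
        for i in range(n):
            terms = [Q[j, patterns[p, j]] for j in range(n) if j != i]
            excl[p, i] = np.prod(terms)
    return excl


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q = rng.dirichlet(np.ones(3), size=4)           # 4 variables, 3 labels
    patterns = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 2, 0]])
    joint, excl = pattern_products(Q, patterns)
    print(np.allclose(excl, pattern_products_naive(Q, patterns)))
```

The naive version costs on the order of \(|\mathcal {P}_c||c|^2\) operations per clique, whereas the division-based version needs only \(|\mathcal {P}_c||c|\), which is where the restriction on \(Q\) taking the values \(0\) and \(1\) pays off.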

References

Adams, A., Baek, J., & Davis, M. A. (2010). Fast high-dimensional filtering using the permutohedral lattice. Computer Graphics Forum, 29(2), 753–762.
Bai, X. and Sapiro, G. (2007). A geodesic framework for fast interactive image and video segmentation and matting. In ICCV.
Bleyer, M., Rhemann, C. and Rother, C. (2012). Extracting 3D scene-consistent object proposals and depth from stereo images. In ECCV, (pp. 467–481).
Bleyer, M., Rother, C., Kohli, P., Scharstein, D. and Sinha, S. (2011). Object stereo - joint stereo matching and object segmentation. In CVPR, (pp. 3081–3088).
Borenstein, E. and Malik, J. (2006). Shape guided object segmentation. In CVPR, (pp. 969–976).
Boykov, Y. and Jolly, M. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV, (pp. 105–112).
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE PAMI, 23(11), 1222–1239.
Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE PAMI, 24(5), 603–619.
Criminisi, A., Sharp, T. and Blake, A. (2008). GeoS: Geodesic image segmentation. In ECCV, (pp. 99–112).
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J. and Zisserman, A. (2011). The PASCAL visual object classes challenge (VOC2011).
Galleguillos, C., Rabinovich, A. and Belongie, S. (2008). Object categorization using co-occurrence, location and appearance. In CVPR.
Gastal, E. S. L., & Oliveira, M. M. (2011). Domain transform for edge-aware image and video processing. ACM Transactions on Graphics, 30(4), 69.
Goldluecke, B. and Cremers, D. (2010). Convex relaxation for multilabel problems with product label spaces. In ECCV, (pp. 225–238).
Gonfaus, J. M., Boix, X., Van De Weijer, J., Bagdanov, A. D., Serrat, J. and Gonzalez, J. (2010). Harmony potentials for joint classification and segmentation. In IEEE CVPR.
Grady, L. (2006). Random walks for image segmentation. TPAMI, 28, 1768–1783.
Kohli, P., Kumar, M.P. and Torr, P.H.S. (2007). P3 & beyond: Solving energies with higher order cliques. In IEEE CVPR.
Koller, D., & Friedman, N. (2009). Probabilistic graphical models. London: MIT Press.
Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE PAMI, 28(10), 1568–1583.
Komodakis, N. and Paragios, N. (2009). Beyond pairwise energies: Efficient optimization for higher-order MRFs. In IEEE CVPR, (pp. 2985–2992).
Komodakis, N., Paragios, N., & Tziritas, G. (2011). MRF energy minimization and beyond via dual decomposition. IEEE PAMI, 33(3), 531–552.
Kornprobst, P., Tumblin, J., & Durand, F. (2009). Bilateral filtering: Theory and applications. Foundations and Trends in Computer Graphics and Vision, 4(1), 1–74.
Krähenbühl, P. and Koltun, V. (2011). Efficient inference in fully connected CRFs with Gaussian edge potentials. In NIPS, (pp. 109–117).
Kumar, M., Torr, P. and Zisserman, A. (2005). Obj cut. In CVPR, (pp. 18–25).
Kumar, M. P., Veksler, O., & Torr, P. H. S. (2011). Improved moves for truncated convex models. JMLR, 12, 31–67.
Ladický, L., Russell, C., Kohli, P. and Torr, P.H.S. (2009). Associative hierarchical CRFs for object class image segmentation. In ICCV, (pp. 739–746).
Ladický, L., Russell, C., Kohli, P. and Torr, P.H.S. (2010). Graph cut based inference with co-occurrence statistics. In ECCV, (pp. 239–253).
Ladický, L., Sturgess, P., Alahari, K., Russell, C. and Torr, P.H.S. (2010). What, where and how many? Combining object detectors and CRFs. In ECCV.
Ladický, L., Sturgess, P., Russell, C., Sengupta, S., Bastanlar, Y., Clocksin, W.F. and Torr, P.H.S. (2010). Joint optimisation for object class segmentation and dense stereo reconstruction. In BMVC, (pp. 1–11).
Lan, X., Roth, S., Huttenlocher, D. and Black, M. (2009). Efficient belief propagation with learned higher-order Markov random fields. In ECCV, (pp. 269–283).
Liu, C., Yuen, J., Torralba, A., Sivic, J. and Freeman, W.T. (2008). SIFT flow: Dense correspondence across different scenes. In ECCV.
Liu, C., Yuen, J. and Torralba, A. (2009). Nonparametric scene parsing: Label transfer via dense scene alignment. In CVPR.
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42, 145–175.
Kumar, M. P. and Torr, P.H.S. (2008). Improved moves for truncated convex models. In NIPS, (pp. 889–896).
Payet, N. and Todorovic, S. (2010). \((\text{RF})^2\): Random forest random field. In NIPS.
Potetz, B., & Lee, T. S. (2008). Efficient belief propagation for higher-order cliques using linear constraint nodes. CVIU, 112, 39–54.
Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E. and Belongie, S. (2007). Objects in context. In ICCV.
Rhemann, C., Hosni, A., Bleyer, M., Rother, C. and Gelautz, M. (2011). Fast cost-volume filtering for visual correspondence and beyond. In CVPR, (pp. 3017–3024).
Rother, C., Kohli, P., Feng, W. and Jia, J. (2009). Minimizing sparse higher order energy functions of discrete variables. In CVPR, (pp. 1382–1389).
Rother, C., Kolmogorov, V., & Blake, A. (2004). Grabcut: Interactive foreground extraction using iterated graph cuts. ACM TOG, 23, 309–314.
Shotton, J., Winn, J. M., Rother, C., & Criminisi, A. (2009). Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 81(1), 2–23.
Singaraju, D., Grady, L. and Vidal, R. (2008). P-Brush: Continuous valued MRFs with normed pairwise distributions for image segmentation. In CVPR.
Torralba, A., Murphy, K. P., & Freeman, W. T. (2007). Sharing visual features for multiclass and multiview object detection. IEEE PAMI, 29, 854–869.
Toyoda, T., & Hasegawa, O. (2008). Random field model for integration of local information and global information. TPAMI, 30, 1483–1489.
Turner, R. E. and Sahani, M. (2011). Two problems with variational expectation maximisation for time-series models. In Bayesian time series models, (pp. 109–130).
Veksler, O. (2007). Graph cut based optimization for MRFs with truncated convex priors. In CVPR.
Weiss, Y. (2001). Comparing the mean field method and belief propagation for approximate inference in MRFs. Advanced mean field methods: Theory and practices. Cambridge, MA: MIT Press.
Woodford, O., Torr, P. H. S., Reid, I., & Fitzgibbon, A. (2009). Global stereo reconstruction under second-order smoothness priors. IEEE PAMI, 31(12), 2115–2128.

Acknowledgments

We thank Paul Sturgess for his discussion on SIFT-flow based initialization. The work was supported by the EPSRC and the IST programme of the European Community, under the PASCAL2 Network of Excellence. Professor Philip H.S. Torr is in receipt of a Royal Society Wolfson Research Merit Award.

Author information

Authors and Affiliations

Oxford Brookes University, Oxford, UK
Vibhav Vineet
MIAS (CSIR), Pretoria, South Africa
Jonathan Warrell
Department of Engineering Science, University of Oxford, Oxford, UK
Philip H. S. Torr

Corresponding author

Correspondence to Vibhav Vineet.

Additional information

Communicated by Carlo Colombo.

Vibhav Vineet and Jonathan Warrell have contributed to this work equally as joint first author.

About this article

Cite this article

Vineet, V., Warrell, J. & Torr, P.H.S. Filter-Based Mean-Field Inference for Random Fields with Higher-Order Terms and Product Label-Spaces. Int J Comput Vis 110, 290–307 (2014). https://doi.org/10.1007/s11263-014-0708-6

Received: 06 June 2013
Accepted: 25 February 2014
Published: 19 March 2014
Issue Date: December 2014
DOI: https://doi.org/10.1007/s11263-014-0708-6
