Robust Higher Order Potentials for Enforcing Label Consistency

Kohli, Pushmeet; Ladický, L’ubor; Torr, Philip H. S.

doi:10.1007/s11263-008-0202-0

Published: 24 January 2009

Volume 82, pages 302–324, (2009)

International Journal of Computer Vision

Pushmeet Kohli¹,
L’ubor Ladický² &
Philip H. S. Torr²

Abstract

This paper proposes a novel framework for labelling problems that combines multiple segmentations in a principled manner. Our method is based on higher order conditional random fields (CRFs) and uses potentials defined on sets of pixels (image segments) generated by unsupervised segmentation algorithms. These potentials enforce label consistency in image regions and can be seen as a generalization of the commonly used pairwise contrast-sensitive smoothness potentials. The higher order potential functions used in our framework take the form of the Robust Pⁿ model and are more general than the Pⁿ Potts model recently proposed by Kohli et al. We prove that the optimal swap and expansion moves for energy functions composed of these potentials can be computed by solving an st-mincut problem. This enables the use of powerful graph cut based move making algorithms for performing inference in the framework. We test our method on the problem of multi-class object segmentation by augmenting the conventional CRF used for object segmentation with higher order potentials defined on image regions. Experiments on challenging data sets show that integrating higher order potentials improves results both quantitatively and qualitatively, leading to much better definition of object boundaries. We believe that this method can be used to yield similar improvements for many other labelling problems.
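The contrast between a strict Potts-style consistency potential and a robust one can be illustrated with a minimal sketch. The abstract does not give the functional form, so the code below assumes a simplified truncated-linear cost: zero when every pixel in a segment shares one label, growing linearly with the number of pixels that deviate from the dominant label, and capped at a maximum. The function names and the parameters `gamma_max` and `q` are illustrative assumptions, not taken from the text above.

```python
from collections import Counter


def pn_potts_cost(labels, gamma_max):
    """Strict consistency cost (assumed form): zero only if all labels agree."""
    return 0.0 if len(set(labels)) <= 1 else float(gamma_max)


def robust_pn_cost(labels, gamma_max, q):
    """Robust consistency cost (assumed truncated-linear form).

    labels    : labels of the pixels in one image segment
    gamma_max : hypothetical maximum penalty for an inconsistent segment
    q         : hypothetical truncation parameter; the cost saturates once
                q or more pixels deviate from the dominant label
    """
    if not labels:
        return 0.0
    # Number of pixels that do not take the most frequent label.
    dominant_count = Counter(labels).most_common(1)[0][1]
    n_inconsistent = len(labels) - dominant_count
    # Linear in the number of deviating pixels, truncated at gamma_max.
    return min(n_inconsistent * gamma_max / q, float(gamma_max))


# One stray pixel in a ten-pixel segment.
segment = ["cow"] * 9 + ["grass"]
strict = pn_potts_cost(segment, gamma_max=10.0)
robust = robust_pn_cost(segment, gamma_max=10.0, q=4)
```

Under these assumptions, the strict cost charges the full penalty for a single stray pixel, while the robust cost charges only a fraction of it, so a segment that is mostly consistent is still encouraged to take one label.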

References

Alahari, K., Kohli, P., & Torr, P. (2008). Reduce, reuse and recycle: efficiently solving multi-label MRFs. In IEEE conference on computer vision and pattern recognition.
Blake, A., Rother, C., Brown, M., Perez, P., & Torr, P. (2004). Interactive image segmentation using an adaptive GMMRF model. In European conference on computer vision (pp. I: 428–441).
Borenstein, E., & Malik, J. (2006). Shape guided object segmentation. In IEEE conference on computer vision and pattern recognition (pp. 969–976).
Boros, E., & Hammer, P. (2002). Pseudo-boolean optimization. Discrete Applied Mathematics, 123(1–3), 155–225.
Boykov, Y., & Jolly, M. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In International conference on computer vision (pp. I: 105–112).
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.
Bray, M., Kohli, P., & Torr, P. (2006). Posecut: Simultaneous segmentation and 3d pose estimation of humans using dynamic graph-cuts. In European conference on computer vision (pp. 642–655).
Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 603–619.
Felzenszwalb, P., & Huttenlocher, D. (2004). Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2), 167–181.
Flach, B. (2002). Strukturelle Bilderkennung (Tech. Rep.). Universität Dresden.
Freedman, D., & Drineas, P. (2005). Energy minimization via graph cuts: Settling what is possible. In IEEE conference on computer vision and pattern recognition (pp. 939–946).
Fujishige, S. (1991). Submodular functions and optimization. Amsterdam: North-Holland.
He, X., Zemel, R., & Carreira-Perpiñán, M. (2004). Multiscale conditional random fields for image labeling. In IEEE conference on computer vision and pattern recognition (2) (pp. 695–702).
He, X., Zemel, R., & Ray, D. (2006). Learning and incorporating top-down cues in image segmentation. In European conference on computer vision (pp. 338–351).
Hoiem, D., Efros, A., & Hebert, M. (2005a). Automatic photo pop-up. ACM Transactions on Graphics, 24(3), 577–584.
Hoiem, D., Efros, A., & Hebert, M. (2005b). Geometric context from a single image. In International conference on computer vision (pp. 654–661).
Huang, R., Pavlovic, V., & Metaxas, D. (2004). A graphical model framework for coupling MRFs and deformable models. In IEEE conference on computer vision and pattern recognition (Vol. 11, pp. 739–746).
Ishikawa, H. (2003). Exact optimization for Markov random fields with convex priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 1333–1336.
Kohli, P., Kumar, M., & Torr, P. (2007). P ³ and beyond: solving energies with higher order cliques. In IEEE conference on computer vision and pattern recognition.
Kohli, P., Ladický, L., & Torr, P. (2008). Robust higher order potentials for enforcing label consistency. In CVPR.
Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568–1583.
Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.
Komodakis, N., & Tziritas, G. (2005). A new framework for approximate labeling via graph cuts. In International conference on computer vision (pp. 1018–1025).
Komodakis, N., Tziritas, G., & Paragios, N. (2007). Fast, approximately optimal solutions for single and dynamic MRFs. In CVPR.
Kumar, M., & Torr, P. (2008). Improved moves for truncated convex models. In Proceedings of advances in neural information processing systems.
Kumar, M., Torr, P., & Zisserman, A. (2005). Obj cut. In IEEE conference on computer vision and pattern recognition (1) (pp. 18–25).
Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labelling sequence data. In International conference on machine learning (pp. 282–289).
Lan, X., Roth, S., Huttenlocher, D., & Black, M. (2006). Efficient belief propagation with learned higher-order Markov random fields. In European conference on computer vision (pp. 269–282).
Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.
Lempitsky, V., Rother, C., & Blake, A. (2007). Logcut—efficient graph cut optimization for Markov random fields. In ICCV.
Levin, A., & Weiss, Y. (2006). Learning to combine bottom-up and top-down segmentation. In European conference on computer vision (pp. 581–594).
Lovász, L. (1983). Submodular functions and convexity. In Mathematical programming: the state of the art (pp. 235–257).
Orlin, J. (2007). A faster strongly polynomial time algorithm for submodular function minimization. In Proceedings of integer programming and combinatorial optimization (pp. 240–251).
Paget, R., & Longstaff, I. (1998). Texture synthesis via a noncausal nonparametric multiscale Markov random field. IEEE Transactions on Image Processing, 7(6), 925–931.
Potetz, B. (2007). Efficient belief propagation for vision using linear constraint nodes. In IEEE conference on computer vision and pattern recognition.
Rabinovich, A., Belongie, S., Lange, T., & Buhmann, J. (2006). Model order selection and cue combination for image segmentation. In IEEE conference on computer vision and pattern recognition (1) (pp. 1130–1137).
Ren, X., & Malik, J. (2003). Learning a classification model for segmentation. In International conference on computer vision (pp. 10–17).
Roth, S., & Black, M. (2005). Fields of experts: A framework for learning image priors. In IEEE conference on computer vision and pattern recognition (pp. 860–867).
Rother, C., Kolmogorov, V., & Blake, A. (2004). Grabcut: interactive foreground extraction using iterated graph cuts. In ACM transactions on graphics (pp. 309–314).
Russell, B., Freeman, W., Efros, A., Sivic, J., & Zisserman, A. (2006). Using multiple segmentations to discover objects and their extent in image collections. In IEEE conference on computer vision and pattern recognition (2) (pp. 1605–1614).
Schlesinger, D., & Flach, B. (2006). Transforming an arbitrary minsum problem into a binary one (Tech. Rep. TUD-FI06-01). Dresden University of Technology, April 2006.
Sharon, E., Brandt, A., & Basri, R. (2001). Segmentation and boundary detection using multiscale intensity measurements. In IEEE conference on computer vision and pattern recognition (1) (pp. 469–476).
Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888–905.
Shotton, J., Winn, J., Rother, C., & Criminisi, A. (2006). TextonBoost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In European conference on computer vision (pp. 1–15).
Tu, Z., & Zhu, S. (2002). Image segmentation by data-driven Markov chain Monte Carlo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5), 657–673.
Veksler, O. (2007). Graph cut based optimization for MRFs with truncated convex priors. In CVPR.
Wainwright, M., Jaakkola, T., & Willsky, A. (2005). Map estimation via agreement on trees: message-passing and linear programming. IEEE Transactions on Information Theory, 51(11), 3697–3717.
Wang, J., Bhat, P., Colburn, A., Agrawala, M., & Cohen, M. (2005). Interactive video cutout. ACM Transactions on Graphics, 24(3), 585–594.
Yedidia, J., Freeman, W., & Weiss, Y. (2000). Generalized belief propagation. In NIPS (pp. 689–695).

Author information

Authors and Affiliations

Microsoft Research, Cambridge, UK
Pushmeet Kohli
Oxford Brookes University, Oxford, UK
L’ubor Ladický & Philip H. S. Torr

Corresponding author

Correspondence to Pushmeet Kohli.

About this article

Cite this article

Kohli, P., Ladický, L. & Torr, P.H.S. Robust Higher Order Potentials for Enforcing Label Consistency. Int J Comput Vis 82, 302–324 (2009). https://doi.org/10.1007/s11263-008-0202-0

Received: 17 August 2008
Accepted: 16 December 2008
Published: 24 January 2009
Issue Date: May 2009
DOI: https://doi.org/10.1007/s11263-008-0202-0
