International Journal of Computer Vision, Volume 103, Issue 2, pp 213–225

Inference Methods for CRFs with Co-occurrence Statistics

  • Ľubor Ladický
  • Chris Russell
  • Pushmeet Kohli
  • Philip H. S. Torr

Abstract

Markov and conditional random fields (CRFs) used in computer vision typically model only local interactions between variables, as this is generally thought to be the only case that is computationally tractable. In this paper we consider a class of global potentials defined over all variables in the CRF. We show how they can be readily optimised using standard graph cut algorithms at little extra expense compared to a standard pairwise field. This result can be directly applied to the problem of class-based image segmentation, which has seen increasing interest within computer vision. Here the aim is to assign a label to each pixel of a given image from a set of possible object classes. Typically these methods use random fields to model local interactions between pixels or super-pixels. One of the cues that helps recognition is global object co-occurrence statistics, a measure of which classes (such as chair or motorbike) are likely to occur together in the same image. Several approaches have been proposed to exploit this property, but all of them suffer from different limitations and typically carry a high computational cost, preventing their application to large images. We find that the new model we propose produces a significant improvement in the labelling compared to using a pairwise model alone, and that this improvement increases as the number of labels increases.

Keywords

Conditional random fields; Object class segmentation; Optimization

Notes

Acknowledgments

This study was supported by EPSRC research grants, HMGCC, and the IST Programme of the European Community under the PASCAL2 Network of Excellence, IST-2007-216886. P. H. S. Torr is in receipt of a Royal Society Wolfson Research Merit Award.

Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • Ľubor Ladický (1)
  • Chris Russell (2)
  • Pushmeet Kohli (3)
  • Philip H. S. Torr (4)

  1. University of Oxford, Oxford, UK
  2. Queen Mary College, University of London, London, UK
  3. Microsoft Research, Cambridge, UK
  4. Oxford Brookes University, Oxford, UK
