An Object Category Specific MRF for Segmentation

  • M. Pawan Kumar
  • Philip H. S. Torr
  • Andrew Zisserman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4170)


In this chapter we present a principled Bayesian method for detecting and segmenting instances of a particular object category within an image, providing a coherent methodology for combining top-down and bottom-up cues. The work draws together two powerful formulations, pictorial structures (PS) and Markov random fields (MRFs), both of which have efficient algorithms for their solution. The resulting combination, which we call the object category specific MRF, suggests a solution to a problem that has long dogged MRFs, namely that they provide a poor prior for specific shapes. In contrast, our model uses the PS to provide a prior that is global across the image plane. We develop an efficient method, ObjCut, to obtain segmentations using this model. Novel aspects of this method include an efficient algorithm for sampling the PS model, and the observation that the expected log likelihood of the model can be increased by a single graph cut. Results are presented on two object categories, cows and horses. We compare our method to the state of the art in object category specific image segmentation and demonstrate significant improvements.
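To make the role of the graph cut concrete, the following is a minimal sketch, not the ObjCut algorithm itself. It labels a toy 1-D row of pixels by minimising a binary MRF energy with a single s-t minimum cut, and it assumes that the shape prior can be folded into the per-pixel (unary) costs. The function name `min_cut_labels`, the integer costs, and the 1-D chain are all illustrative assumptions; the max-flow routine is a plain Edmonds-Karp implementation.

```python
from collections import deque


def min_cut_labels(unary_fg, unary_bg, pairwise_weight):
    """Minimise sum_i U_i(m_i) + pairwise_weight * sum_i [m_i != m_{i+1}]
    over binary labels m_i (1 = foreground, 0 = background) on a 1-D chain,
    using one s-t minimum cut. Costs are assumed to be non-negative integers."""
    n = len(unary_fg)
    source, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # reverse edge for the residual graph

    for i in range(n):
        add_edge(source, i, unary_bg[i])  # cut when pixel i is labelled background
        add_edge(i, sink, unary_fg[i])    # cut when pixel i is labelled foreground
    for i in range(n - 1):
        add_edge(i, i + 1, pairwise_weight)  # cut when neighbours disagree
        add_edge(i + 1, i, pairwise_weight)

    def reachable_from_source():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    parent = reachable_from_source()
    while sink in parent:
        # Recover the augmenting path and push its bottleneck flow.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= flow
            cap[w][u] += flow
        parent = reachable_from_source()

    # Pixels still connected to the source lie on the foreground side of the cut.
    return [1 if i in parent else 0 for i in range(n)]


# Hypothetical costs: appearance alone is ambiguous in the middle of the row.
appearance_fg = [4, 3, 3, 3, 3, 4]
appearance_bg = [1, 3, 3, 3, 3, 1]
# Hypothetical shape prior: foreground is cheap where the object model projects.
shape_fg = [3, 0, 0, 0, 0, 3]
shape_bg = [0, 2, 2, 2, 2, 0]

without_prior = min_cut_labels(appearance_fg, appearance_bg, 1)
with_prior = min_cut_labels(
    [a + s for a, s in zip(appearance_fg, shape_fg)],
    [a + s for a, s in zip(appearance_bg, shape_bg)],
    1,
)
```

With these toy costs, the appearance terms alone label every pixel as background, whereas adding the shape term recovers the four middle pixels as foreground. This is only meant to illustrate why a shape-aware prior can change the result of an otherwise identical cut.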


Keywords: Pairwise Potential, Pictorial Structure, Part Label, Chamfer Distance, Tree Cascade
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • M. Pawan Kumar (1)
  • Philip H. S. Torr (1)
  • Andrew Zisserman (2)
  1. Department of Computing, Oxford Brookes University, Oxford
  2. Department of Engineering Science, University of Oxford, Oxford
