Computer Vision pp 51-108

Part of the Studies in Computational Intelligence book series (SCI, volume 285)

Dynamic Graph Cuts and Their Applications in Computer Vision

  • Pushmeet Kohli
  • Philip H. S. Torr

Abstract

Over the last few years, energy minimization has emerged as an indispensable tool in computer vision. The primary reason for this rising popularity is the success of efficient graph-cut-based minimization algorithms in solving many low-level vision problems, such as image segmentation, object reconstruction, image restoration, and disparity estimation. The scale and form of computer vision problems introduce many challenges in energy minimization. In this chapter, we address the problem of efficient and exact minimization of groups of similar functions that are known to be solvable in polynomial time. We present a novel dynamic algorithm for minimizing such functions. This algorithm reuses computation from previous problem instances to solve new instances, resulting in a substantial improvement in running time. We present the results of using this approach on the problems of interactive image segmentation, image segmentation in video, human pose estimation and segmentation, and measuring the uncertainty of solutions obtained by minimizing energy functions.
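The idea of reusing computation across similar problem instances can be illustrated with a small max-flow sketch. This is not the chapter's algorithm: it is a minimal illustration that assumes edge capacities only increase between instances, in which case the previous flow stays feasible and the solver can continue from the old residual graph. The names `build_residual` and `augment` are hypothetical helpers introduced for this example.

```python
from collections import deque


def build_residual(edges):
    """Build a residual graph (dict of dicts) from {(u, v): capacity}."""
    residual = {}
    for (u, v), cap in edges.items():
        residual.setdefault(u, {}).setdefault(v, 0)
        residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
        residual[u][v] += cap
    return residual


def augment(residual, source, sink):
    """Push flow along shortest augmenting paths until none remain.

    Updates `residual` in place and returns the flow added by this call.
    """
    added = 0
    while True:
        # Breadth-first search for a path with positive residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return added
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Update residual capacities along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        added += bottleneck


# First instance: solve from scratch.
edges = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 2,
         ("b", "t"): 3, ("a", "b"): 1}
residual = build_residual(edges)
flow = augment(residual, "s", "t")

# Second instance: two capacities grow (assumption: increases only).
# The old flow is still feasible, so only the extra flow is searched for.
increases = {("s", "a"): 2, ("a", "t"): 2}
for (u, v), delta in increases.items():
    residual[u][v] += delta
flow += augment(residual, "s", "t")

# Reference: solve the second instance from scratch for comparison.
new_edges = {e: c + increases.get(e, 0) for e, c in edges.items()}
scratch_flow = augment(build_residual(new_edges), "s", "t")
```

In this sketch the warm-started solve needs a single augmentation, while the solve from scratch repeats all the earlier work. Handling capacity decreases requires additional machinery that this example deliberately leaves out.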



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Pushmeet Kohli (1)
  • Philip H. S. Torr (2)
  1. Microsoft Research, Cambridge, UK
  2. Oxford Brookes University, Oxford, UK
