A Deep Variational Model for Image Segmentation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8753)


In this paper we introduce a novel model that combines Deep Convolutional Neural Networks with a global inference model. Our model is derived from a convex variational relaxation of the minimum s-t cut problem on graphs, which is frequently used for image segmentation. We treat the outputs of Convolutional Neural Networks as the unary and pairwise potentials of a graph and derive a smooth approximation to the minimum s-t cut problem. During training, this approximation allows the Convolutional Neural Network to adapt to the smoothing induced by the global model. The training algorithm can be understood as a modified backpropagation algorithm that explicitly takes the global inference layer into account.
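To make the relaxed objective concrete, the following is a minimal sketch of the energy that a relaxed s-t cut minimizes: a unary term plus weighted absolute differences across graph edges. The function name `cut_energy`, the toy potentials, and the `eps` smoothing (which replaces |d| by sqrt(d^2 + eps^2) - eps) are illustrative assumptions; the abstract does not specify the particular smooth approximation used in the paper, and in the model itself the potentials would be network outputs rather than fixed numbers.

```python
import itertools
import numpy as np

def cut_energy(u, unary, edges, weights, eps=0.0):
    """Relaxed s-t cut energy: sum_i unary[i]*u[i] + sum_e weights[e]*|u[i]-u[j]|.

    With eps > 0 the absolute value is replaced by sqrt(d^2 + eps^2) - eps,
    a smooth surrogate chosen for illustration only (an assumption, not
    necessarily the approximation derived in the paper).
    """
    u = np.asarray(u, dtype=float)
    d = u[edges[:, 0]] - u[edges[:, 1]]
    pair = np.sqrt(d * d + eps * eps) - eps if eps > 0 else np.abs(d)
    return float(unary @ u + weights @ pair)

# Toy 4-pixel chain; all values are made up for illustration.
unary = np.array([-2.0, -1.0, 0.5, 2.0])    # negative favours label 1 (figure)
edges = np.array([[0, 1], [1, 2], [2, 3]])  # neighbouring pixel pairs
weights = np.array([1.0, 0.3, 1.0])         # pairwise potentials (edge costs)

# Brute-force the best binary labeling of this tiny graph.
best = min(itertools.product([0, 1], repeat=4),
           key=lambda u: cut_energy(u, unary, edges, weights))
best_energy = cut_energy(best, unary, edges, weights)
smooth_energy = cut_energy(best, unary, edges, weights, eps=0.1)
```

On this toy chain the cheapest labeling cuts the weakest edge, between the second and third pixel. For binary `u` the energy equals the cost of the corresponding cut up to a constant, which is why minimizing over the relaxed interval [0, 1] is a relaxation of the discrete problem.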

We illustrate our approach on the task of supervised figure-ground segmentation. In contrast to competing approaches, we train directly on the raw pixels of the input images and do not rely on hand-crafted features. Despite its generality, simplicity and complete lack of hand-crafted features, our approach achieves competitive performance on the Graz02 and Weizmann Horses datasets.


Keywords: Image Segmentation, Convolutional Neural Network, Gaussian Random Field, Unary Potential, Pairwise Potential



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Institute for Computer Graphics and Vision, Graz University of Technology, Graz, Austria
  2. Safety and Security Department, AIT Austrian Institute of Technology, Seibersdorf, Austria
