Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation

  • Alexander Kolesnikov
  • Christoph H. Lampert
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9908)


We introduce a new loss function for the weakly-supervised training of semantic image segmentation models based on three guiding principles: to seed with weak localization cues, to expand objects based on the information about which classes can occur in an image, and to constrain the segmentations to coincide with object boundaries. We show experimentally that training a deep convolutional neural network using the proposed loss function leads to substantially better segmentations than previous state-of-the-art methods on the challenging PASCAL VOC 2012 dataset. We furthermore give insight into the working mechanism of our method by a detailed experimental study that illustrates how the segmentation quality is affected by each term of the proposed loss function as well as their combinations.
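The abstract names three principles but does not give the formulas, so the following is only an illustrative sketch of how a three-term loss of this shape could be assembled. All function names, the decay value, the rank-weighted pooling used in the expansion term, and the use of a KL divergence against a reference distribution for the constraint term are assumptions made for illustration; they should not be read as the authors' exact definitions.

```python
import numpy as np

EPS = 1e-8  # numerical guard for logarithms (assumed value)

def seed_loss(probs, seeds):
    """Mean negative log-probability at pixels that carry a weak localization cue.
    probs: (H, W, C) per-pixel class probabilities.
    seeds: (H, W, C) binary mask, 1 where a cue for that class is present."""
    return -np.sum(seeds * np.log(probs + EPS)) / max(seeds.sum(), 1)

def rank_pool(scores, decay):
    """Weighted average of scores sorted in descending order (weights decay**rank)."""
    s = np.sort(scores.ravel())[::-1]
    w = decay ** np.arange(s.size)
    return np.sum(w * s) / np.sum(w)

def expand_loss(probs, present, decay=0.99):
    """Reward classes listed in the image-level labels, penalize the others.
    present: (C,) binary vector of image-level labels."""
    pooled = np.array([rank_pool(probs[..., c], decay)
                       for c in range(probs.shape[-1])])
    pos = -np.sum(present * np.log(pooled + EPS)) / max(present.sum(), 1)
    neg = -np.sum((1 - present) * np.log(1 - pooled + EPS)) / max((1 - present).sum(), 1)
    return pos + neg

def constrain_loss(probs, reference):
    """Mean per-pixel KL(reference || probs).
    reference: (H, W, C) stand-in for a boundary-aware smoothed segmentation."""
    kl = np.sum(reference * (np.log(reference + EPS) - np.log(probs + EPS)), axis=-1)
    return kl.mean()

def total_loss(probs, seeds, present, reference):
    """Unweighted sum of the three terms (equal weighting is an assumption)."""
    return (seed_loss(probs, seeds)
            + expand_loss(probs, present)
            + constrain_loss(probs, reference))
```

In this sketch each term can be switched off independently, which mirrors the kind of per-term ablation the abstract describes.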


Keywords: Weakly-supervised image segmentation, Deep learning



This work was funded by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 308036. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used for this research. We also thank Vittorio Ferrari for helpful feedback.




Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. IST Austria, Klosterneuburg, Austria
