International Journal of Computer Vision, Volume 117, Issue 3, pp 199–225

Midrange Geometric Interactions for Semantic Segmentation: Constraints for Continuous Multi-label Optimization
  • Julia Diebold
  • Claudia Nieuwenhuis
  • Daniel Cremers

Abstract

In this article we introduce the concept of midrange geometric constraints into semantic segmentation. We call these constraints ‘midrange’ because they are neither global constraints, which take all pixels into account without any spatial limitation, nor local constraints, which regard only single pixels or pairwise relations. Instead, the proposed constraints make it possible to discourage the occurrence of labels in the vicinity of each other, e.g., ‘wolf’ and ‘sheep’. ‘Vicinity’ encompasses spatial distance and specific spatial directions simultaneously, e.g., ‘plates’ are found directly above ‘tables’, but do not fly over them. The user specifies the spatial extent of the constraint for each pair of labels. Such constraints are of interest not only for scene segmentation but also for part-based articulated or rigid objects: object parts such as arms, torso and legs usually obey specific spatial rules, which are among the few properties that remain valid for articulated objects over many images and which can be expressed in terms of the proposed midrange constraints, i.e., closeness and/or direction. We show how midrange geometric constraints are formulated within a continuous multi-label optimization framework, and we give a convex relaxation that allows us to find globally optimal solutions of the relaxed problem independent of the initialization.
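To make the notion of ‘vicinity’ concrete, the following is a minimal discrete sketch and not the continuous convex formulation of the article. It assumes a hard labeling on a pixel grid and a user-defined set of pixel offsets that encodes both distance and direction for one pair of labels; the penalty simply counts pixels of one label that fall inside the vicinity of the other. All names (`midrange_penalty`, `offsets`, `SHEEP`, `WOLF`) are hypothetical and chosen for illustration only.

```python
def midrange_penalty(labels, label_a, label_b, offsets):
    """Count pixels of label_b lying in the vicinity of any label_a pixel.

    labels  : 2D list of integer labels (all rows have equal length).
    offsets : list of (dy, dx) displacements; pixel p + (dy, dx) counts as
              being 'in the vicinity' of p. Negative dy points upward in
              the image. This set is an assumption standing in for the
              user-defined spatial extent of the constraint.
    """
    height, width = len(labels), len(labels[0])

    # Mark every pixel reachable from a label_a pixel via one of the offsets
    # (a dilation of the label_a indicator by the offset set).
    near_a = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if labels[y][x] != label_a:
                continue
            for dy, dx in offsets:
                qy, qx = y + dy, x + dx
                if 0 <= qy < height and 0 <= qx < width:
                    near_a[qy][qx] = True

    # Each label_b pixel inside the marked region contributes one unit.
    return sum(
        1
        for y in range(height)
        for x in range(width)
        if near_a[y][x] and labels[y][x] == label_b
    )


BACKGROUND, SHEEP, WOLF = 0, 1, 2

labeling = [
    [0, 0, 2, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 2],
]

# Directional vicinity: up to two pixels directly above / directly below.
above = [(-1, 0), (-2, 0)]
below = [(1, 0), (2, 0)]
# Non-directional vicinity: square window of radius 2 around the pixel.
box = [(dy, dx) for dy in range(-2, 3) for dx in range(-2, 3) if (dy, dx) != (0, 0)]

penalty_above = midrange_penalty(labeling, SHEEP, WOLF, above)
penalty_below = midrange_penalty(labeling, SHEEP, WOLF, below)
penalty_box = midrange_penalty(labeling, SHEEP, WOLF, box)
```

Changing only the offset set switches between a purely distance-based constraint (`box`) and a directional one (`above`, `below`), which mirrors the idea that the spatial extent is specified separately for each pair of labels.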

Keywords

Variational, Image segmentation, Convex optimization, Directional relations, Geometric relations, Midlevel range interactions

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Julia Diebold (1)
  • Claudia Nieuwenhuis (2)
  • Daniel Cremers (1)

  1. Technische Universität München, Munich, Germany
  2. ICSI, UC Berkeley, Berkeley, USA
