Skip to main content
Log in

Discriminative Training of Conditional Random Fields with Probably Submodular Constraints

  • Published:
International Journal of Computer Vision Aims and scope Submit manuscript

Abstract

Problems of segmentation, denoising, registration and 3D reconstruction are often addressed with the graph cut algorithm. However, solving an unconstrained graph cut problem is NP-hard. For tractable optimization, pairwise potentials have to fulfill the submodularity inequality. In our learning paradigm, pairwise potentials are created as the dot product of a learned vector w with positive feature vectors. In order to constrain such a model to remain tractable, previous approaches have enforced the weight vector to be positive for pairwise potentials in which the labels differ, and set pairwise potentials to zero in the case that the label remains the same. Such constraints are sufficient to guarantee that the resulting pairwise potentials satisfy the submodularity inequality. However, we show that such an approach unnecessarily restricts the capacity of the learned models. Guaranteeing submodularity for all possible inputs, no matter how improbable, reduces inference error to effectively zero, but increases model error. In contrast, we relax the requirement of guaranteed submodularity to solutions that are probably approximately submodular. We show that the conceptually simple strategy of enforcing submodularity on the training examples guarantees with low sample complexity that test images will also yield submodular pairwise potentials. Results are presented in the binary and muticlass settings, showing substantial improvement from the resulting increased model capacity.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

Notes

  1. In the following, we write \(\varvec{\phi }_u(x^k)\) and \(\varvec{\phi }_p(x^k,x^l)\) as a shorthand for \(\varvec{\phi }_u^{k}(x^k,x^l)\) and \(\varvec{\phi }_p^{k,l}(x^k,x^l)\).

References

  • Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Susstrunk, S. (2012). SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2274–2282.

    Article  Google Scholar 

  • Anguelov, D., Taskar, B., Chatalbashev, V., Koller, D., Gupta, D., Heitz, G., et al. (2005). Discriminative learning of Markov random fields for segmentation of 3D scan data. In Proceedings of the IEEE conference on computer vision and pattern recognition (Vol. 2, pp. 169–176).

  • Bakır, G. H., Hofmann, T., Schölkopf, B., Smola, A. J., Taskar, B., & Vishwanathan, S. V. N. (2007). Predicting structured data. Cambridge: The MIT Press.

    Book  Google Scholar 

  • Barahona, F. (1982). On the computational complexity of ising spin glass models. Journal of Physics A: Mathematical and General, 15(10), 3241.

    Article  MathSciNet  Google Scholar 

  • Bartlett, P. L., Jordan, M. I., & McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473), 138–156.

    Article  MathSciNet  Google Scholar 

  • Billingsley, P. (1995). Probability and measure. New York: Wiley.

    MATH  Google Scholar 

  • Boykov, Y., & Jolly, M. P. (2001). Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV (pp 105–112).

  • Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.

    Article  Google Scholar 

  • Casser, V., Kang, K. C., Pfister, H., & Haehn, D. (2018). Fast mitochondria segmentation for connectomics. CoRR arXiv:1812.06024.

  • Chandra, S., & Kokkinos, I. (2016). Fast, exact and multi-scale inference for semantic image segmentation with deep Gaussian CRFs. In European conference on computer vision.

  • Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.

    Article  Google Scholar 

  • Chen, L. C., Schwing, A. G., Yuille, A. L., & Urtasun, R. (2015). Learning deep structured models. In Proceedings of ICML.

  • Cheng, H. C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE international conference on image processing (ICIP) (pp. 590–594).

  • Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labelled classification. In NIPS.

  • Felzenszwalb, P. F., Girshick, R. B., & McAllester, D. (2010). Cascade object detection with deformable part models. In CVPR.

  • Finley, T., & Joachims, T. (2008). Training structural SVMs when exact inference is intractable. In ICML.

  • Gelasca, E. D., Byun, J., Obara, B., & Manjunath, B. (2008). Evaluation and benchmark for biological image segmentation. In 15th IEEE international conference on image processing, 2008. ICIP 2008 (pp. 1816–1819). IEEE.

  • Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 721–741.

    Article  Google Scholar 

  • Gjorgjevikj, D., & Madjarov, G. (2011). Two stage classifier chain architecture for efficient pair-wise multi-label learning. In 2011 IEEE international workshop on machine learning for signal processing (pp 1–6).

  • Haralick, R. M., Shanmugam, K. S., & Dinstein, I. (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610–621.

    Article  Google Scholar 

  • Har-Peled, S. (2011). On the expected complexity of random convex hulls. CoRR arXiv:1111.5340.

  • Joachims, T., Finley, T., & Yu, C. N. J. (2009). Cutting-plane training of structural SVMs. Machine Learning, 77(1), 27–59. https://doi.org/10.1007/s10994-009-5108-8.

    Article  MATH  Google Scholar 

  • Kolmogorov, V. (2005). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 1568–1583.

    Article  Google Scholar 

  • Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.

    Article  Google Scholar 

  • Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the eighteenth international conference on machine learning (pp 282–289).

  • Leibe, B., Leonardis, A., & Schiele, B. (2004). Combined object categorization and segmentation with an implicit shape model. In Workshop on statistical learning in computer vision, ECCV.

  • Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., et al. (2015). Learning structured models for segmentation of 2-d and 3-d imagery. IEEE Transactions on Medical Imaging, 34(5), 1096–1110.

    Article  Google Scholar 

  • Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31(2), 474–486.

    Article  Google Scholar 

  • Magnus, J. R., & Neudecker, H. (1995). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.

    MATH  Google Scholar 

  • Metzen, J. H., Kumar, M. C., Brox, T., & Fischer, V. (2017). Universal adversarial perturbations against semantic image segmentation. In The IEEE international conference on computer vision (ICCV).

  • Müller, A. C., & Behnke, S. (2014). PyStruct—Learning structured prediction in python. Journal of Machine Learning Research, 15, 2055–2060.

    MathSciNet  MATH  Google Scholar 

  • Nowozin, S., Gehler, P. V., & Lampert, C. H. (2010). On parameter learning in CRF-based approaches to object class image segmentation. ECCV, 6, 98–111.

    Google Scholar 

  • Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, & A. F. Frangi (Eds.), Medical image computing and computer-assisted intervention (pp. 234–241). Berlin: Springer.

    Google Scholar 

  • Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In 2007 IEEE conference on computer vision and pattern recognition (pp. 1–8). IEEE.

  • Rother, C., Kumar, S., Kolmogorov, V., & Blake, A. (2005). Digital tapestry [automatic image synthesis]. In IEEE computer society conference on computer vision and pattern recognition, 2005. CVPR 2005 (Vol. 1, pp. 589–596). IEEE.

  • Szummer, M., Kohli, P., & Hoiem, D. (2008). Learning CRFs using graph cuts. In D. A. Forsyth, P. H. S. Torr, & A. Zisserman (Eds.) European conference on computer vision. Lecture notes in computer science (Vol. 5303, pp. 582–595). Berlin: Springer.

  • Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484.

    MathSciNet  MATH  Google Scholar 

  • Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.

    Article  Google Scholar 

  • Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.

    MATH  Google Scholar 

  • Zaremba, W., & Blaschko, M. B. (2016). Discriminative training of CRF models with probably submodular constraints. In IEEE winter conference on applications of computer vision.

Download references

Acknowledgements

This work is partially funded by Internal Funds KU Leuven and an Amazon Research Award. This work made use of a hardware donation from the Facebook GPU Partnership program, and an NVIDIA GPU donation. We acknowledge support from the Research Foundation - Flanders (FWO) through Project Number G0A2716N. This research received funding from the Flemish Government under the Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen programme. The authors thank Wojciech Zaremba for making this work possible.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Maxim Berman.

Additional information

Communicated by Christoph H. Lampert.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Berman, M., Blaschko, M.B. Discriminative Training of Conditional Random Fields with Probably Submodular Constraints. Int J Comput Vis 128, 1722–1735 (2020). https://doi.org/10.1007/s11263-019-01277-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11263-019-01277-y

Keywords

Navigation