Max-Margin Learning of Deep Structured Models for Semantic Segmentation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10270)

Abstract

In recent years, most work on image segmentation has focused on deep learning, and on Convolutional Neural Networks (CNNs) in particular. CNNs are powerful at modeling complex relationships between input and output data, but they lack the ability to directly model dependencies in the output structure, for instance to enforce properties such as smoothness and coherence. This drawback motivates the use of Conditional Random Fields (CRFs), which are widely applied as a post-processing step in semantic segmentation.

In this paper, we propose a learning framework that jointly trains the parameters of a CNN paired with a CRF. To this end, we develop theoretical tools that make it possible to optimize a max-margin objective with back-propagation. The max-margin loss function gives the model good generalization capabilities, so the method is especially suitable for applications where labelled data is limited, such as medical applications. This generalization capability is reflected in our results, where we show good performance on two relatively small medical datasets. The method is also evaluated on a public benchmark frequently used for semantic segmentation, yielding results competitive with the state of the art. Overall, we demonstrate that end-to-end max-margin training is preferable to piecewise training when combining a CNN with a CRF.
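To make the idea of a max-margin objective that can be back-propagated more concrete, the following is a minimal sketch of a generic structured hinge loss and its subgradients. It is not the formulation from the paper: the Potts pairwise term, the Hamming task loss, the brute-force loss-augmented inference, and all names (`energy`, `structured_hinge`, `unary`, `w`, `edges`) are assumptions chosen for illustration. The subgradient with respect to `unary` is the kind of quantity that could be passed back into a CNN producing the unary costs.

```python
import itertools
import numpy as np


def energy(y, unary, w, edges):
    """Assumed CRF energy: unary costs plus a Potts penalty w per disagreeing edge."""
    y = np.asarray(y)
    unary_term = unary[np.arange(len(y)), y].sum()
    pair_term = sum(y[i] != y[j] for i, j in edges)
    return unary_term + w * pair_term


def structured_hinge(unary, w, edges, y_true):
    """Structured hinge loss with subgradients w.r.t. the unary costs and w.

    loss = E(y_true) - min_y [E(y) - Hamming(y, y_true)], which is always >= 0.
    Loss-augmented inference is done by brute force, so this only suits toy sizes.
    """
    n, k = unary.shape
    y_true = np.asarray(y_true)

    # Loss-augmented inference: find the labeling that violates the margin most.
    best_val, y_hat = None, None
    for y in itertools.product(range(k), repeat=n):
        y = np.asarray(y)
        val = energy(y, unary, w, edges) - np.sum(y != y_true)
        if best_val is None or val < best_val:
            best_val, y_hat = val, y

    loss = energy(y_true, unary, w, edges) - best_val

    # Subgradient w.r.t. the unary costs: +1 at true labels, -1 at y_hat labels.
    grad_unary = np.zeros_like(unary, dtype=float)
    grad_unary[np.arange(n), y_true] += 1.0
    grad_unary[np.arange(n), y_hat] -= 1.0

    # Subgradient w.r.t. w: difference in the number of disagreeing edges.
    def n_cut(y):
        return sum(y[i] != y[j] for i, j in edges)

    grad_w = float(n_cut(y_true) - n_cut(y_hat))
    return loss, grad_unary, grad_w, y_hat


if __name__ == "__main__":
    # Three "pixels" in a chain, two labels, uninformative unary costs.
    chain = [(0, 1), (1, 2)]
    loss, g_u, g_w, y_hat = structured_hinge(np.zeros((3, 2)), 0.0, chain, [0, 0, 1])
    print(loss, y_hat, g_w)
```

When the unary costs already separate the true labeling from every other labeling by at least the task loss, the loss-augmented minimizer coincides with the ground truth and both subgradients vanish, which is the margin behaviour the hinge is meant to produce. A practical system would replace the brute-force search with an efficient inference routine.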

Keywords

Segmentation, Convolutional Neural Networks, Markov random fields

Supplementary material

Supplementary material 1: 450651_1_En_3_MOESM1_ESM.pdf (PDF, 1562 KB)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Chalmers University of Technology, Gothenburg, Sweden
  2. Centre for Mathematical Sciences, Lund University, Lund, Sweden
