Multi-utility Learning: Structured-Output Learning with Multiple Annotation-Specific Loss Functions

Shapovalov, Roman; Vetrov, Dmitry; Osokin, Anton; Kohli, Pushmeet

doi:10.1007/978-3-319-14612-6_30

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8932)

Included in the following conference series:

International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition

Abstract

Structured-output learning is a challenging problem, particularly because large datasets of fully labelled training instances are difficult to obtain. In this paper we address this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms of supervision. We propose a unified technique for inferring the loss functions most suitable for quantifying the consistency of solutions with a given weak annotation. We demonstrate the effectiveness of our framework on the challenging problem of semantic image segmentation, for which a wide variety of annotations can be used. For instance, popular training datasets for semantic segmentation are composed of images with hard-to-generate full pixel labellings, as well as images with easy-to-obtain weak annotations, such as bounding boxes around objects or image-level labels that specify which object categories are present in an image. Experimental evaluation shows that the use of annotation-specific loss functions dramatically improves segmentation accuracy compared to the baseline system where only one type of weak annotation is used.
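To make the idea of annotation-specific loss functions concrete, the following is a minimal sketch in Python. It is not the authors' formulation: the abstract does not define the losses, so the three functions below (`hamming_loss`, `image_level_loss`, `bounding_box_loss`), their normalisation, and the `annotation_loss` dispatcher are all hypothetical choices made purely for illustration. Each function scores how inconsistent a predicted labelling is with one form of supervision mentioned in the abstract: a full pixel labelling, a set of image-level labels, or a list of bounding boxes.

```python
from typing import List, Set, Tuple

Labelling = List[List[int]]           # H x W grid of class ids
Box = Tuple[int, int, int, int, int]  # (class_id, top, left, bottom, right), bottom/right exclusive


def hamming_loss(pred: Labelling, truth: Labelling) -> float:
    """Full annotation: fraction of pixels whose label differs from the ground truth."""
    total = sum(len(row) for row in truth)
    wrong = sum(p != t for p_row, t_row in zip(pred, truth)
                for p, t in zip(p_row, t_row))
    return wrong / total


def image_level_loss(pred: Labelling, present: Set[int]) -> float:
    """Image-level labels (assumed non-empty): penalise pixels labelled with a
    class that is not listed, plus listed classes that never appear."""
    pixels = [p for row in pred for p in row]
    off_list = sum(p not in present for p in pixels) / len(pixels)
    missing = len(present - set(pixels)) / len(present)
    return off_list + missing


def bounding_box_loss(pred: Labelling, boxes: List[Box]) -> float:
    """Bounding boxes: fraction of pixels that carry a boxed class but lie
    outside every box of that class."""
    boxed_classes = {b[0] for b in boxes}
    total = sum(len(row) for row in pred)
    outside = 0
    for r, row in enumerate(pred):
        for c, label in enumerate(row):
            if label in boxed_classes and not any(
                cls == label and top <= r < bottom and left <= c < right
                for cls, top, left, bottom, right in boxes
            ):
                outside += 1
    return outside / total


# Hypothetical dispatcher: pick the loss that matches the annotation type.
LOSSES = {"full": hamming_loss, "labels": image_level_loss, "boxes": bounding_box_loss}


def annotation_loss(pred: Labelling, kind: str, annotation) -> float:
    return LOSSES[kind](pred, annotation)
```

In a training loop of this kind, each example would carry its own annotation type, and the learner would call `annotation_loss(pred, kind, annotation)` so that weakly and fully annotated images contribute to the same objective through different losses.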

Author information

Authors and Affiliations

Lomonosov Moscow State University, Russia
Roman Shapovalov, Dmitry Vetrov & Anton Osokin
INRIA — SIERRA Project Team, Paris, France
Anton Osokin
Microsoft Research, Cambridge, UK
Pushmeet Kohli

Editor information

Editors and Affiliations

Department of Mathematics, University of Bergen, 7800, Bergen, Norway
Xue-Cheng Tai
Department of Mathematics, University of California, Los Angeles, CA, USA
Egil Bae
The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, S.A.R.
Tony F. Chan
Telemark University College, Postboks 203, 3901, Porsgrunn, Norway
Marius Lysaker

About this paper

Cite this paper

Shapovalov, R., Vetrov, D., Osokin, A., Kohli, P. (2015). Multi-utility Learning: Structured-Output Learning with Multiple Annotation-Specific Loss Functions. In: Tai, XC., Bae, E., Chan, T.F., Lysaker, M. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2015. Lecture Notes in Computer Science, vol 8932. Springer, Cham. https://doi.org/10.1007/978-3-319-14612-6_30

DOI: https://doi.org/10.1007/978-3-319-14612-6_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14611-9
Online ISBN: 978-3-319-14612-6
eBook Packages: Computer Science, Computer Science (R0)
