Abstract
We consider the problem of training discriminative structured output predictors, such as conditional random fields (CRFs) and structured support vector machines (SSVMs). A generalized loss function is introduced, which jointly maximizes the entropy and the margin of the solution. The CRF and the SSVM emerge as special cases of our framework. The probabilistic interpretation of large-margin methods reveals insights about margin and slack rescaling. Furthermore, we derive the corresponding extensions for latent variable models, in which training operates on partially observed outputs. Experimental results for multiclass classification, linear-chain models, and multiple instance learning demonstrate that the generalized loss can improve the accuracy of the resulting classifiers.
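The interpolation between the two losses can be illustrated with a small numeric sketch. The parameterization below is an assumption based on the abstract's description (a softmax-margin-style family with an inverse temperature `beta`), not the paper's exact objective: with `beta = 1` and no margin term the loss reduces to the CRF negative log-likelihood, and as `beta` grows it approaches the SSVM structured hinge loss. All function and variable names here are hypothetical.

```python
import numpy as np

def generalized_loss(scores, margin, y_true, beta):
    """Sketch of an entropy-and-margin loss family (assumed form):
    L_beta = (1/beta) * log sum_y exp(beta * (score(y) + Delta(y)))
             - score(y_true)
    Computed with the usual max-shift trick for numerical stability.
    """
    z = beta * (scores + margin)
    m = z.max()
    log_sum_exp = m + np.log(np.exp(z - m).sum())
    return log_sum_exp / beta - scores[y_true]

# Toy multiclass example with 3 labels.
scores = np.array([1.0, 0.3, -0.5])  # model scores <w, phi(x, y)>
y = 0
margin = np.where(np.arange(3) == y, 0.0, 1.0)  # 0/1 task loss Delta(y, y')

# beta = 1, no margin: CRF negative log-likelihood, -log p(y | x).
crf_like = generalized_loss(scores, np.zeros(3), y, beta=1.0)

# Large beta with margin: approaches the SSVM structured hinge loss,
# max_y (score(y) + Delta(y)) - score(y_true).
ssvm_like = generalized_loss(scores, margin, y, beta=1e4)
hinge = (scores + margin).max() - scores[y]
```

Intermediate values of `beta` trade off the entropy of the CRF solution against the margin of the SSVM solution, which is the regime where the generalized loss can outperform both special cases.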
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Pletscher, P., Ong, C.S., Buhmann, J.M. (2010). Entropy and Margin Maximization for Structured Output Learning. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science(), vol 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_6
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8