Weakly Supervised Discriminative Training of Linear Models for Natural Language Processing

Rojas-Barahona, Lina Maria; Cerisara, Christophe

doi:10.1007/978-3-319-25789-1_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9449)

Included in the following conference series:

International Conference on Statistical Language and Speech Processing

Abstract

This work explores weakly supervised training of discriminative linear classifiers. Such feature-rich classifiers have been widely adopted by the natural language processing (NLP) community because of their powerful modeling capacity and their support for correlated features, which allow the expert task of designing features to be separated from the core learning method. However, unsupervised training of discriminative models is more challenging than unsupervised training of generative models. We adapt a recently proposed approximation of the classifier risk and derive a closed-form solution that greatly speeds up its convergence. This method is appealing because it provably converges towards the minimum risk without any labeled corpus, given only two reasonable assumptions: one about the rank of the class marginals, the other about the Gaussianity of the class-conditional linear scores. We also show that the method is a viable and interesting way to achieve weakly supervised training of linear classifiers in two NLP tasks: predicate and entity recognition.
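To make the role of the Gaussianity assumption concrete, the sketch below shows one way a classifier risk can be written in closed form once the class-conditional linear scores are taken to be Gaussian. It is an illustration only, not the derivation from the chapter: the choice of the hinge loss, the binary setting, and the function names (`expected_hinge`, `gaussian_hinge_risk`) are assumptions introduced here. It relies on the standard identity E[max(0, u)] = m Φ(m/s) + s φ(m/s) for u ~ N(m, s²).

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_hinge(mean, std):
    """Closed form of E[max(0, 1 - z)] for z ~ N(mean, std^2)."""
    a = (1.0 - mean) / std
    return (1.0 - mean) * normal_cdf(a) + std * normal_pdf(a)


def gaussian_hinge_risk(prior_pos, mu_pos, sd_pos, mu_neg, sd_neg):
    """Hinge risk of a binary linear classifier (hypothetical helper).

    Assumes the linear score is N(mu_pos, sd_pos^2) on the positive class
    and N(mu_neg, sd_neg^2) on the negative class, with a known positive
    class marginal prior_pos. No labels are needed once these
    quantities are available.
    """
    # Positive class: the margin is +score.
    risk_pos = expected_hinge(mu_pos, sd_pos)
    # Negative class: the margin is -score, so the mean flips sign.
    risk_neg = expected_hinge(-mu_neg, sd_neg)
    return prior_pos * risk_pos + (1.0 - prior_pos) * risk_neg


if __name__ == "__main__":
    # Well-separated score distributions should give a lower risk.
    print(gaussian_hinge_risk(0.5, 1.0, 1.0, -1.0, 1.0))
    print(gaussian_hinge_risk(0.5, 3.0, 1.0, -3.0, 1.0))
```

In an unsupervised setting, the means and standard deviations would have to be estimated from the unlabeled scores, for example by fitting a two-component Gaussian mixture whose weights are fixed to the class marginals; that estimation step is left out of this sketch.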

Notes

1. This work has been partly funded by the ANR ContNomina project.
2. http://nlp.stanford.edu/nlp.

Author information

Authors and Affiliations

Université de Lorraine/LORIA, Nancy, France
Lina Maria Rojas-Barahona
CNRS/LORIA, Nancy, France
Christophe Cerisara

Corresponding author

Correspondence to Christophe Cerisara.

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistic, Rovira i Virgili University, Tarragona, Spain
Adrian-Horia Dediu
Research Group on Mathematical Linguistic, Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
Klára Vicsi

About this paper

Cite this paper

Rojas-Barahona, L.M., Cerisara, C. (2015). Weakly Supervised Discriminative Training of Linear Models for Natural Language Processing. In: Dediu, AH., Martín-Vide, C., Vicsi, K. (eds) Statistical Language and Speech Processing. SLSP 2015. Lecture Notes in Computer Science, vol 9449. Springer, Cham. https://doi.org/10.1007/978-3-319-25789-1_23

DOI: https://doi.org/10.1007/978-3-319-25789-1_23
Published: 17 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25788-4
Online ISBN: 978-3-319-25789-1
eBook Packages: Computer Science, Computer Science (R0)
