The Effect of Semi-supervised Learning on Parsing Long Distance Dependencies in German and Swedish

Søgaard, Anders; Rishøj, Christian

doi:10.1007/978-3-642-14770-8_44

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6233)

Included in the following conference series:

International Conference on Natural Language Processing

Abstract

This paper shows how the best data-driven dependency parsers available today [1] can be improved by learning from unlabeled data. We focus on German and Swedish and show that labeled attachment scores improve by 1.5%-2.5%. Error analysis shows that improvements are primarily due to better recovery of long distance dependencies.

References

Martins, A., Das, D., Smith, N., Xing, E.: Stacking dependency parsers. In: EMNLP, Honolulu, Hawaii (2008)
Rimell, L., Clark, S., Steedman, M.: Unbounded dependency recovery for parser evaluation. In: EMNLP, Singapore (2009)
Abney, S.: Semi-supervised learning for computational linguistics. Chapman and Hall, Boca Raton (2008)
Wolpert, D.: Stacked generalization. Neural Networks 5, 241–259 (1992)
Sagae, K., Lavie, A.: Parser combination by reparsing. In: HLT-NAACL, New York City, NY (2006)
Hall, J., et al.: Single malt or blended? In: CONLL, Prague, Czech Republic (2007)
Nivre, J., McDonald, R.: Integrating graph-based and transition-based dependency parsers. In: ACL-HLT, Columbus, Ohio (2008)
Fishel, M., Nivre, J.: Voting and stacking in data-driven dependency parsing. In: NODALIDA, Odense, Denmark (2009)
Surdeanu, M., Manning, C.: Ensemble models for dependency parsing: cheap and good? In: NAACL, Los Angeles, CA (2010)
Li, M., Zhou, Z.H.: Tri-training: exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering 17(11), 1529–1541 (2005)
Koo, T., Carreras, X., Collins, M.: Simple semi-supervised dependency parsing. In: ACL, Columbus, Ohio (2008)
Wang, Q., Lin, D., Schuurmans, D.: Semi-supervised convex training for dependency parsing. In: ACL, Columbus, Ohio (2008)
Suzuki, J., Isozaki, H., Carreras, X., Collins, M.: Semi-supervised convex training for dependency parsing. In: EMNLP, Singapore (2009)
Sagae, K., Tsujii, J.: Dependency parsing and domain adaptation with LR models and parser ensembles. In: EMNLP-CONLL, Prague, Czech Republic (2007)
Chen, W., Zhang, Y., Isahara, H.: Chinese chunking with tri-training learning. In: Computer processing of oriental languages, pp. 466–473. Springer, Berlin (2006)
Nguyen, T., Nguyen, L., Shimazu, A.: Using semi-supervised learning for question classification. Journal of Natural Language Processing 15, 3–21 (2008)
Sindhwani, V., Keerthi, S.: Large scale semi-supervised linear SVMs. In: ACM SIGIR, Seattle, WA (2006)
McDonald, R., Pereira, F., Ribarov, K., Hajič, J.: Non-projective dependency parsing using spanning tree algorithms. In: HLT-EMNLP 2005, Vancouver, British Columbia (2005)
Nivre, J., et al.: MaltParser. Natural Language Engineering 13(2), 95–135 (2007)
Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001)
Brants, S., Hansen, S., Lezius, W., Smith, G.: The TIGER treebank. In: TLT, Sozopol, Bulgaria (2002)
Nilsson, J., Hall, J., Nivre, J.: MAMBA meets TIGER: Reconstructing a Swedish treebank from antiquity. In: NODALIDA, Joensuu, Finland (2005)
Gimenez, J., Marquez, L.: SVMTool: a general POS tagger generator based on support vector machines. In: LREC, Lisbon, Portugal (2004)
Eisner, J.: Three new probabilistic models for dependency parsing. In: COLING, Copenhagen, Denmark (1996)
Zeman, D., Žabokrtský, Z.: Improving parsing accuracy by combining diverse dependency parsers. In: IWPT, Vancouver, Canada (2005)

Author information

Authors and Affiliations

Center for Language Technology, University of Copenhagen, Njalsgade 140–142, DK-2300, Copenhagen S
Anders Søgaard & Christian Rishøj

Editor information

Editors and Affiliations

School of Computer Science, Reykjavik University, Kringlan 1, 103, Reykjavik, Iceland
Hrafn Loftsson
Department of Icelandic, University of Iceland, Árnagardur v/Sudurgötu, 101, Reykjavik, Iceland
Eiríkur Rögnvaldsson
Arni Magnusson Institute for Icelandic Studies, Neshagi 16, 101, Reykjavik, Iceland
Sigrún Helgadóttir

About this paper

Cite this paper

Søgaard, A., Rishøj, C. (2010). The Effect of Semi-supervised Learning on Parsing Long Distance Dependencies in German and Swedish. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_44

DOI: https://doi.org/10.1007/978-3-642-14770-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science, Computer Science (R0)
