The Effect of Semi-supervised Learning on Parsing Long Distance Dependencies in German and Swedish

  • Conference paper
Advances in Natural Language Processing (NLP 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6233)

Abstract

This paper shows how the best data-driven dependency parsers available today [1] can be improved by learning from unlabeled data. We focus on German and Swedish and show that labeled attachment scores improve by 1.5%-2.5%. Error analysis shows that improvements are primarily due to better recovery of long distance dependencies.
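For readers unfamiliar with the metric, the labeled attachment score (LAS) is the fraction of tokens that receive both the correct head and the correct dependency label. The following is a minimal sketch of how LAS might be computed overall and restricted to longer dependencies. It is not the authors' evaluation code: the `(head, label)` token representation, the `min_dist` threshold, and the example labels are assumptions made for illustration, and the abstract does not say how the paper delimits long distance dependencies.

```python
from typing import List, Tuple

# Assumed representation: one (head_index, label) pair per token,
# tokens numbered from 1, head 0 standing for the artificial root.
Token = Tuple[int, str]


def las(gold: List[Token], pred: List[Token], min_dist: int = 0) -> float:
    """Labeled attachment score over tokens whose gold dependency
    spans at least `min_dist` positions (hypothetical threshold)."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses differ in length")
    correct = total = 0
    for position, (g, p) in enumerate(zip(gold, pred), start=1):
        head = g[0]
        # Root attachments are given length 0, so they only count
        # towards the unrestricted score (an assumption of this sketch).
        length = abs(position - head) if head != 0 else 0
        if length < min_dist:
            continue
        total += 1
        if g == p:  # head and label must both match
            correct += 1
    return correct / total if total else 0.0


# Toy five-token sentence with made-up labels.
gold = [(2, "SB"), (0, "ROOT"), (5, "NK"), (5, "NK"), (2, "OA")]
pred = [(2, "SB"), (0, "ROOT"), (5, "NK"), (5, "NK"), (4, "OA")]

overall = las(gold, pred)               # 4 of 5 tokens correct
long_only = las(gold, pred, min_dist=2)  # 1 of 2 longer arcs correct
```

In this toy example the single error falls on the longest dependency, so the restricted score is lower than the overall one. Comparing scores across such length bins is one way an error analysis could attribute gains to long distance dependencies.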

References

  1. Martins, A., Das, D., Smith, N., Xing, E.: Stacking dependency parsers. In: EMNLP, Honolulu, Hawaii (2008)

  2. Rimell, L., Clark, S., Steedman, M.: Unbounded dependency recovery for parser evaluation. In: EMNLP, Singapore (2009)

  3. Abney, S.: Semi-supervised learning for computational linguistics. Chapman and Hall, Boca Raton (2008)

  4. Wolpert, D.: Stacked generalization. Neural Networks 5, 241–259 (1992)

  5. Sagae, K., Lavie, A.: Parser combination by reparsing. In: HLT-NAACL, New York City, NY (2006)

  6. Hall, J., et al.: Single malt or blended? In: CONLL, Prague, Czech Republic (2007)

  7. Nivre, J., McDonald, R.: Integrating graph-based and transition-based dependency parsers. In: ACL-HLT, Columbus, Ohio (2008)

  8. Fishel, M., Nivre, J.: Voting and stacking in data-driven dependency parsing. In: NODALIDA, Odense, Denmark (2009)

  9. Surdeanu, M., Manning, C.: Ensemble models for dependency parsing: cheap and good? In: NAACL, Los Angeles, CA (2010)

  10. Li, M., Zhou, Z.H.: Tri-training: exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering 17(11), 1529–1541 (2005)

  11. Koo, T., Carreras, X., Collins, M.: Simple semi-supervised dependency parsing. In: ACL, Columbus, Ohio (2008)

  12. Wang, Q., Lin, D., Schuurmans, D.: Semi-supervised convex training for dependency parsing. In: ACL, Columbus, Ohio (2008)

  13. Suzuki, J., Isozaki, H., Carreras, X., Collins, M.: An empirical study of semi-supervised structured conditional models for dependency parsing. In: EMNLP, Singapore (2009)

  14. Sagae, K., Tsujii, J.: Dependency parsing and domain adaptation with LR models and parser ensembles. In: EMNLP-CONLL, Prague, Czech Republic (2007)

  15. Chen, W., Zhang, Y., Isahara, H.: Chinese chunking with tri-training learning. In: Computer processing of oriental languages, pp. 466–473. Springer, Berlin (2006)

  16. Nguyen, T., Nguyen, L., Shimazu, A.: Using semi-supervised learning for question classification. Journal of Natural Language Processing 15, 3–21 (2008)

  17. Sindhwani, V., Keerthi, S.: Large scale semi-supervised linear SVMs. In: ACM SIGIR, Seattle, WA (2006)

  18. McDonald, R., Pereira, F., Ribarov, K., Hajič, J.: Non-projective dependency parsing using spanning tree algorithms. In: HLT-EMNLP 2005, Vancouver, British Columbia (2005)

  19. Nivre, J., et al.: MaltParser. Natural Language Engineering 13(2), 95–135 (2007)

  20. Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001)

  21. Brants, S., Hansen, S., Lezius, W., Smith, G.: The TIGER treebank. In: TLT, Sozopol, Bulgaria (2002)

  22. Nilsson, J., Hall, J., Nivre, J.: MAMBA meets TIGER: Reconstructing a Swedish treebank from antiquity. In: NODALIDA, Joensuu, Finland (2005)

  23. Gimenez, J., Marquez, L.: SVMTool: a general POS tagger generator based on support vector machines. In: LREC, Lisbon, Portugal (2004)

  24. Eisner, J.: Three new probabilistic models for dependency parsing. In: COLING, Copenhagen, Denmark (1996)

  25. Zeman, D., Žabokrtský, Z.: Improving parsing accuracy by combining diverse dependency parsers. In: IWPT, Vancouver, Canada (2005)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

Cite this paper

Søgaard, A., Rishøj, C. (2010). The Effect of Semi-supervised Learning on Parsing Long Distance Dependencies in German and Swedish. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_44

  • DOI: https://doi.org/10.1007/978-3-642-14770-8_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14769-2

  • Online ISBN: 978-3-642-14770-8

  • eBook Packages: Computer Science (R0)
