Training with Lexical Information

Semi-Supervised Dependency Parsing

Abstract

This chapter describes word-level approaches, which use information derived from word surface forms. Lexical information is very important for resolving ambiguous relationships in dependency parsing, but lexicalized statistics are sparse and difficult to estimate directly from a limited training data set. It is therefore attractive to learn lexical information from large-scale unlabeled data, such as web data.

Notes

  1. The clusters are those provided by Koo et al. (2008), whose clustering recovers at most 1,000 distinct bit strings.

  2. http://w3.msi.vxu.se/~nivre/research/Penn2Malt.html

  3. We ensure that the text used for extracting subtrees does not include the sentences of the Penn Treebank.

References

  • Charniak, E., Blaheta, D., Ge, N., Hall, K., Hale, J., & Johnson, M. (2000). BLLIP 1987–89 WSJ Corpus Release 1, LDC2000T43. Linguistic Data Consortium.

  • Chen, W., Zhang, M., & Zhang, Y. (2013). Semi-supervised feature transformation for dependency parsing. In Proceedings of EMNLP, Seattle (pp. 1303–1313). Association for Computational Linguistics. http://www.aclweb.org/anthology/D13-1129.

  • Koo, T., Carreras, X., & Collins, M. (2008). Simple semi-supervised dependency parsing. In Proceedings of ACL-08: HLT, Columbus.

  • Marcus, M. P., Santorini, B., & Marcinkiewicz, M. A. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.

  • McDonald, R., & Nivre, J. (2007). Characterizing the errors of data-driven dependency parsing models. In Proceedings of EMNLP-CoNLL, Prague (pp. 122–131).

  • Miller, S., Guinness, J., & Zamanian, A. (2004). Name tagging with word clusters and discriminative training. In S. Dumais, D. Marcu, & S. Roukos (Eds.), HLT-NAACL 2004: Main proceedings, Boston (pp. 337–342). Association for Computational Linguistics.

  • Ratnaparkhi, A. (1996). A maximum entropy model for part-of-speech tagging. In Proceedings of EMNLP 1996, Philadelphia (pp. 133–142).

  • Brants, T., & Franz, A. (2006). Web 1T 5-gram Version 1, LDC2006T13. Linguistic Data Consortium. https://catalog.ldc.upenn.edu/LDC2006T13.

  • Yamada, H., & Matsumoto, Y. (2003). Statistical dependency analysis with support vector machines. In Proceedings of IWPT, Nancy (pp. 195–206).

  • Zhou, G., Zhao, J., Liu, K., & Cai, L. (2011). Exploiting web-derived selectional preference to improve statistical dependency parsing. In Proceedings of ACL-HLT2011, Portland (pp. 1556–1565). Association for Computational Linguistics. http://www.aclweb.org/anthology/P11-1156.

Copyright information

© 2015 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Chen, W., Zhang, M. (2015). Training with Lexical Information. In: Semi-Supervised Dependency Parsing. Springer, Singapore. https://doi.org/10.1007/978-981-287-552-5_5
