Abstract
This chapter describes word-level approaches, which make use of information based on word surface forms. Lexical information is very important for resolving ambiguous relationships in dependency parsing, but lexicalized statistics are sparse and difficult to estimate directly from a limited training data set. It is therefore attractive to learn lexical information from large-scale unlabeled data, such as web data.
Notes
- 1.
The clusters are those provided by Koo et al. (2008), who recover at most 1,000 distinct bit strings.
- 2.
- 3.
We ensure that the text used for extracting subtrees does not include the sentences of the Penn Treebank.
References
Charniak, E., Blaheta, D., Ge, N., Hall, K., Hale, J., & Johnson, M. (2000). BLLIP 1987–89 WSJ Corpus Release 1, LDC2000T43. Linguistic Data Consortium.
Chen, W., Zhang, M., & Zhang, Y. (2013). Semi-supervised feature transformation for dependency parsing. In Proceedings of EMNLP, Seattle (pp. 1303–1313). Association for Computational Linguistics. http://www.aclweb.org/anthology/D13-1129.
Koo, T., Carreras, X., & Collins, M. (2008). Simple semi-supervised dependency parsing. In Proceedings of ACL-08: HLT, Columbus.
Marcus, M. P., Santorini, B., & Marcinkiewicz, M. A. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
McDonald, R., & Nivre, J. (2007). Characterizing the errors of data-driven dependency parsing models. In Proceedings of EMNLP-CoNLL, Prague (pp. 122–131).
Miller, S., Guinness, J., & Zamanian, A. (2004). Name tagging with word clusters and discriminative training. In S. Dumais, D. Marcu, & S. Roukos (Eds.), HLT-NAACL 2004: Main proceedings, Boston (pp. 337–342). Association for Computational Linguistics.
Ratnaparkhi, A. (1996). A maximum entropy model for part-of-speech tagging. In Proceedings of EMNLP 1996, Philadelphia (pp. 133–142).
Brants, T., & Franz, A. (2006). Web 1T 5-gram Version 1, LDC2006T13. Linguistic Data Consortium. https://catalog.ldc.upenn.edu/LDC2006T13.
Yamada, H., & Matsumoto, Y. (2003). Statistical dependency analysis with support vector machines. In Proceedings of IWPT, Nancy (pp. 195–206).
Zhou, G., Zhao, J., Liu, K., & Cai, L. (2011). Exploiting web-derived selectional preference to improve statistical dependency parsing. In Proceedings of ACL-HLT2011, Portland (pp. 1556–1565). Association for Computational Linguistics. http://www.aclweb.org/anthology/P11-1156.
Copyright information
© 2015 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Chen, W., Zhang, M. (2015). Training with Lexical Information. In: Semi-Supervised Dependency Parsing. Springer, Singapore. https://doi.org/10.1007/978-981-287-552-5_5
DOI: https://doi.org/10.1007/978-981-287-552-5_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-551-8
Online ISBN: 978-981-287-552-5
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)