Strictly Lexicalised Dependency Parsing

Trends in Parsing Technology

Part of the book series: Text, Speech and Language Technology (TLTB, volume 43)

Abstract

There has been a great deal of progress in statistical parsing over the past decade (Collins, 1996, 1997; Charniak, 2000). A common characteristic of these generative parsers is their use of lexical statistics. However, bi-lexical statistics (parameters that involve two words) have turned out to play a much smaller role than previously believed. Gildea (2001) found that removing bi-lexical statistics from a state-of-the-art PCFG parser changed its output very little. Bikel (2004) observed that only 1.49% of the bi-lexical statistics needed during parsing were found in the training corpus; when only the bigram statistics involved in the highest-probability parse are considered, the percentage rises to 28.8%. Moreover, even when bi-lexical statistics do get used, they are remarkably similar to their back-off values based on part-of-speech tags, which makes their utility rather questionable. Klein and Manning (2003) present an unlexicalised parser that eliminates all lexical parameters yet scores close to state-of-the-art lexicalised parsers.
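To make the back-off idea concrete, here is a minimal Python sketch of a bi-lexical dependency probability interpolated with its part-of-speech back-off. The toy corpus, the counts, and the interpolation weight lam are hypothetical; this is an illustration of tag back-off in general, not the model used in this chapter.

    from collections import Counter

    # Hypothetical dependencies as (head word, head tag, dependent word, dependent tag);
    # in practice these would be extracted from a treebank.
    deps = [
        ("ate", "VBD", "John", "NNP"),
        ("ate", "VBD", "pizza", "NN"),
        ("saw", "VBD", "movie", "NN"),
    ]

    bilex = Counter((h, d) for h, _, d, _ in deps)       # word-word counts
    head_w = Counter(h for h, _, _, _ in deps)
    bitag = Counter((ht, dt) for _, ht, _, dt in deps)   # tag-tag back-off counts
    head_t = Counter(ht for _, ht, _, _ in deps)

    def p_dep(head, head_tag, dep, dep_tag, lam=0.7):
        """Interpolate the sparse bi-lexical estimate with its POS-tag back-off."""
        p_word = bilex[(head, dep)] / head_w[head] if head_w[head] else 0.0
        p_tag = bitag[(head_tag, dep_tag)] / head_t[head_tag] if head_t[head_tag] else 0.0
        return lam * p_word + (1 - lam) * p_tag

    # The unseen pair ("saw", "pizza") still receives probability mass via its tags.
    print(p_dep("saw", "VBD", "pizza", "NN"))  # 0.3 * (2/3) = 0.2

Bikel's observation that used bi-lexical estimates closely track their tag back-offs corresponds, in this sketch, to p_word and p_tag rarely disagreeing when the word pair has actually been seen.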

Notes

  1. Although non-projective trees exist, the dependency trees used in our experiments are projective trees converted from the Penn Chinese Treebank.

  2. We also computed directed dependency accuracy, which is defined as the percentage of words that have the correct head. We observed that the directed dependency accuracy is only slightly lower than the undirected one; both measures are sketched below.
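The following is a minimal sketch of the two accuracy measures, assuming heads are encoded as token indices with -1 marking the root; the function name and encoding are illustrative, not taken from the chapter.

    def dependency_accuracy(gold_heads, pred_heads):
        """Return (directed, undirected) dependency accuracy.

        gold_heads[i] / pred_heads[i] is the index of word i's head (-1 = root).
        Directed accuracy requires the exact head; undirected accuracy also
        credits a predicted link whose direction is reversed relative to gold.
        """
        assert len(gold_heads) == len(pred_heads)
        n = len(gold_heads)
        directed = sum(g == p for g, p in zip(gold_heads, pred_heads)) / n
        gold_edges = {frozenset((i, h)) for i, h in enumerate(gold_heads)}
        undirected = sum(frozenset((i, h)) in gold_edges
                         for i, h in enumerate(pred_heads)) / n
        return directed, undirected

    # Example: the parser links the right pair of words but reverses the direction.
    print(dependency_accuracy([1, -1], [-1, 0]))  # (0.0, 0.5)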

References

  • Bikel, D.M. (2004). Intricacies of Collins’ parsing model. Computational Linguistics 30(4), 479–511.

  • Bikel, D.M. and D. Chiang (2000). Two statistical parsing models applied to the Chinese Treebank. In Proceedings of the 2nd Chinese Language Processing Workshop, Hong Kong.

  • Charniak, E. (2000). A maximum-entropy-inspired parser. In Proceedings of the North American Chapter of the Association for Computational Linguistics, Seattle, Washington, pp. 132–139.

  • Clark, S., J. Hockenmaier, and M. Steedman (2002). Building deep dependency structures with a wide-coverage CCG parser. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA.

  • Collins, M. (1996). A new statistical parser based on bigram lexical dependencies. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Santa Cruz, California, pp. 184–191.

  • Collins, M. (1997). Three generative, lexicalized models for statistical parsing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Madrid, pp. 16–23.

  • Collins, M. (1999). Head-driven statistical models for natural language parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia, PA.

  • Dagan, I., L. Lee, and F. Pereira (1999). Similarity-based models of word cooccurrence probabilities. Machine Learning 34(1–3), 43–69.

  • Eisner, J. (1996). Three new probabilistic models for dependency parsing: an exploration. In Proceedings of the International Conference on Computational Linguistics, Copenhagen.

  • Eisner, J. and G. Satta (1999). Efficient parsing for bilexical context-free grammars and head-automaton grammars. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Maryland.

  • Gildea, D. (2001). Corpus variation and parser performance. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Pittsburgh, PA.

  • Graff, D. (2003). English Gigaword. Philadelphia, PA: Linguistic Data Consortium.

  • Graff, D. and K. Chen (2003). Chinese Gigaword. Philadelphia, PA: Linguistic Data Consortium.

  • Grefenstette, G. (1994). Corpus-derived first, second and third-order word affinities. In Proceedings of Euralex, Amsterdam.

  • Harris, Z. (1968). Mathematical Structures of Language. New York, NY: Wiley.

  • Hindle, D. (1990). Noun classification from predicate-argument structures. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Pittsburgh, PA, pp. 268–275.

  • Jurafsky, D. and J. Martin (2000). Speech and Language Processing. Upper Saddle River, NJ: Prentice Hall.

  • Klein, D. and C. Manning (2003). Accurate unlexicalized parsing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Sapporo.

  • Klein, D. and C. Manning (2004). Corpus-based induction of syntactic structure: models of dependency and constituency. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Barcelona.

  • Levy, R. and C.D. Manning (2003). Is it harder to parse Chinese, or the Chinese Treebank? In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Sapporo, Hokkaido.

  • Lin, D. (1995). A dependency-based method for evaluating broad-coverage parsers. In Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, Quebec, pp. 1420–1425.

  • Lin, D. (1998). Automatic retrieval and clustering of similar words. In Proceedings of the International Conference on Computational Linguistics and the Annual Meeting of the Association for Computational Linguistics, Montreal, Quebec, pp. 768–774.

  • Manning, C. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.

  • McDonald, R., K. Crammer, and F. Pereira (2005). Online large-margin training of dependency parsers. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan.

  • Nivre, J. (2003). An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies, Nancy, pp. 149–160.

  • Nivre, J., J. Hall, J. Nilsson, A. Chanev, G. Eryiğit, S. Kübler, S. Marinov, and E. Marsi (2007). MaltParser: a language-independent system for data-driven dependency parsing. Natural Language Engineering 13, 95–135.

  • Pereira, F., N. Tishby, and L. Lee (1993). Distributional clustering of English words. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Columbus, Ohio, pp. 183–190.

  • Ratnaparkhi, A. (1999). Learning to parse natural language with maximum entropy models. Machine Learning 34(1–3), 151–175.

  • Yamada, H. and Y. Matsumoto (2003). Statistical dependency analysis with support vector machines. In Proceedings of the International Workshop on Parsing Technologies, Nancy.

Acknowledgements

We would like to thank Mark Steedman for suggesting the comparison with unlexicalised parsing in Section 7.6, and the anonymous reviewers for their useful comments. This work was supported by NSERC, the Alberta Ingenuity Centre for Machine Learning, and the Canada Research Chairs program. The first author was also supported by an iCORE scholarship.

Author information

Correspondence to Qin Iris Wang.

Copyright information

© 2010 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Wang, Q.I., Schuurmans, D., Lin, D. (2010). Strictly Lexicalised Dependency Parsing. In: Bunt, H., Merlo, P., Nivre, J. (eds) Trends in Parsing Technology. Text, Speech and Language Technology, vol 43. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9352-3_7
