Strictly Lexicalised Dependency Parsing

Trends in Parsing Technology

Part of the book series: Text, Speech and Language Technology (TLTB, volume 43)

Abstract

There has been a great deal of progress in statistical parsing over the past decade (Collins, 1996, 1997; Charniak, 2000). A common characteristic of these generative parsers is their use of lexical statistics. However, bi-lexical statistics (parameters that involve two words) have turned out to play a much smaller role than previously believed. Gildea (2001) found that removing bi-lexical statistics from a state-of-the-art PCFG parser changed its output very little. Bikel (2004) observed that only 1.49% of the bi-lexical statistics needed during parsing were found in the training corpus; when only the bigram statistics involved in the highest-probability parse are considered, the percentage rises to 28.8%. Moreover, even when bi-lexical statistics do get used, they are remarkably similar to their back-off values based on part-of-speech tags, which makes their utility rather questionable. Klein and Manning (2003) present an unlexicalised parser that eliminates all lexical parameters yet scores close to state-of-the-art lexicalised parsers.
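To make the back-off idea concrete, here is a minimal Python sketch of a bi-lexical dependency probability interpolated with its part-of-speech back-off. The toy corpus, the counts, and the interpolation weight lam are hypothetical; this is an illustration of tag back-off in general, not the model used in this chapter.

    from collections import Counter

    # Hypothetical dependencies as (head word, head tag, dependent word, dependent tag);
    # in practice these would be extracted from a treebank.
    deps = [
        ("ate", "VBD", "John", "NNP"),
        ("ate", "VBD", "pizza", "NN"),
        ("saw", "VBD", "movie", "NN"),
    ]

    bilex = Counter((h, d) for h, _, d, _ in deps)       # word-word counts
    head_w = Counter(h for h, _, _, _ in deps)
    bitag = Counter((ht, dt) for _, ht, _, dt in deps)   # tag-tag back-off counts
    head_t = Counter(ht for _, ht, _, _ in deps)

    def p_dep(head, head_tag, dep, dep_tag, lam=0.7):
        """Interpolate the sparse bi-lexical estimate with its POS-tag back-off."""
        p_word = bilex[(head, dep)] / head_w[head] if head_w[head] else 0.0
        p_tag = bitag[(head_tag, dep_tag)] / head_t[head_tag] if head_t[head_tag] else 0.0
        return lam * p_word + (1 - lam) * p_tag

    # The unseen pair ("saw", "pizza") still receives probability mass via its tags.
    print(p_dep("saw", "VBD", "pizza", "NN"))  # 0.3 * (2/3) = 0.2

Bikel's observation that used bi-lexical estimates closely track their tag back-offs corresponds, in this sketch, to p_word and p_tag rarely disagreeing when the word pair has actually been seen.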

Notes

  1. Although non-projective trees exist, the dependency trees used in our experiments are projective trees converted from the Penn Chinese Treebank.

  2. We also computed directed dependency accuracy, which is defined as the percentage of words that have the correct head. We observed that the directed dependency accuracy is only slightly lower than the undirected one; both measures are sketched below.
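The following is a minimal sketch of the two accuracy measures, assuming heads are encoded as token indices with -1 marking the root; the function name and encoding are illustrative, not taken from the chapter.

    def dependency_accuracy(gold_heads, pred_heads):
        """Return (directed, undirected) dependency accuracy.

        gold_heads[i] / pred_heads[i] is the index of word i's head (-1 = root).
        Directed accuracy requires the exact head; undirected accuracy also
        credits a predicted link whose direction is reversed relative to gold.
        """
        assert len(gold_heads) == len(pred_heads)
        n = len(gold_heads)
        directed = sum(g == p for g, p in zip(gold_heads, pred_heads)) / n
        gold_edges = {frozenset((i, h)) for i, h in enumerate(gold_heads)}
        undirected = sum(frozenset((i, h)) in gold_edges
                         for i, h in enumerate(pred_heads)) / n
        return directed, undirected

    # Example: the parser links the right pair of words but reverses the direction.
    print(dependency_accuracy([1, -1], [-1, 0]))  # (0.0, 0.5)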

References

  • Bikel, D.M. (2004). Intricacies of Collins’ parsing model. Computational Linguistics 30(4), 479–511.

  • Bikel, D.M. and D. Chiang (2000). Two statistical parsing models applied to the Chinese Treebank. In Proceedings of the 2nd Chinese Language Processing Workshop, Hong Kong.

  • Charniak, E. (2000). A maximum-entropy-inspired parser. In Proceedings of the North American Chapter of the Association for Computational Linguistics, Seattle, Washington, pp. 132–139.

  • Clark, S., J. Hockenmaier, and M. Steedman (2002). Building deep dependency structures with a wide-coverage CCG parser. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA.

  • Collins, M. (1996). A new statistical parser based on bigram lexical dependencies. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Santa Cruz, California, pp. 184–191.

  • Collins, M. (1997). Three generative, lexicalized models for statistical parsing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Madrid, pp. 16–23.

  • Collins, M. (1999). Head-driven statistical models for natural language parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia, PA.

  • Dagan, I., L. Lee, and F. Pereira (1999). Similarity-based models of word cooccurrence probabilities. Machine Learning 34(1–3), 43–69.

  • Eisner, J. (1996). Three new probabilistic models for dependency parsing: an exploration. In Proceedings of the International Conference on Computational Linguistics, Copenhagen.

  • Eisner, J. and G. Satta (1999). Efficient parsing for bilexical context-free grammars and head-automaton grammars. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Maryland.

  • Gildea, D. (2001). Corpus variation and parser performance. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Pittsburgh, PA.

  • Graff, D. (2003). English Gigaword. Philadelphia, PA: Linguistic Data Consortium.

  • Graff, D. and K. Chen (2003). Chinese Gigaword. Philadelphia, PA: Linguistic Data Consortium.

  • Grefenstette, G. (1994). Corpus-derived first, second and third-order word affinities. In Proceedings of Euralex, Amsterdam.

  • Harris, Z. (1968). Mathematical Structures of Language. New York, NY: Wiley.

  • Hindle, D. (1990). Noun classification from predicate-argument structures. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Pittsburgh, PA, pp. 268–275.

  • Jurafsky, D. and J. Martin (2000). Speech and Language Processing. Upper Saddle River, NJ: Prentice Hall.

  • Klein, D. and C. Manning (2003). Accurate unlexicalized parsing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Sapporo.

  • Klein, D. and C. Manning (2004). Corpus-based induction of syntactic structure: models of dependency and constituency. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Barcelona.

  • Levy, R. and C.D. Manning (2003). Is it harder to parse Chinese, or the Chinese Treebank? In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Sapporo, Hokkaido.

  • Lin, D. (1995). A dependency-based method for evaluating broad-coverage parsers. In Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, Quebec, pp. 1420–1425.

  • Lin, D. (1998). Automatic retrieval and clustering of similar words. In Proceedings of the International Conference on Computational Linguistics and the Annual Meeting of the Association for Computational Linguistics, Montreal, Quebec, pp. 768–774.

  • Manning, C. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.

  • McDonald, R., K. Crammer, and F. Pereira (2005). Online large-margin training of dependency parsers. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan.

  • Nivre, J. (2003). An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies, Nancy, pp. 149–160.

  • Nivre, J., J. Hall, J. Nilsson, A. Chanev, G. Eryiğit, S. Kübler, S. Marinov, and E. Marsi (2007). MaltParser: a language-independent system for data-driven dependency parsing. Natural Language Engineering 13, 95–135.

  • Pereira, F., N. Tishby, and L. Lee (1993). Distributional clustering of English words. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Columbus, Ohio, pp. 183–190.

  • Ratnaparkhi, A. (1999). Learning to parse natural language with maximum entropy models. Machine Learning 34(1–3), 151–175.

  • Yamada, H. and Y. Matsumoto (2003). Statistical dependency analysis with support vector machines. In Proceedings of the International Workshop on Parsing Technologies, Nancy.

Acknowledgements

We would like to thank Mark Steedman for suggesting the comparison with unlexicalised parsing in Section 7.6, and the anonymous reviewers for their useful comments. This work was supported by NSERC, the Alberta Ingenuity Centre for Machine Learning, and the Canada Research Chairs program. The first author was also supported by an iCORE scholarship.

Author information

Correspondence to Qin Iris Wang.

Copyright information

© 2010 Springer Science+Business Media B.V.

About this chapter

Cite this chapter

Wang, Q.I., Schuurmans, D., Lin, D. (2010). Strictly Lexicalised Dependency Parsing. In: Bunt, H., Merlo, P., Nivre, J. (eds) Trends in Parsing Technology. Text, Speech and Language Technology, vol 43. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9352-3_7
