
A Latent Variable Model for Generative Dependency Parsing

Trends in Parsing Technology

Part of the book series: Text, Speech and Language Technology ((TLTB,volume 43))

Abstract

Dependency parsing has been a topic of active research in natural language processing during the last several years. The CoNLL-2006 shared task (Buchholz and Marsi, 2006) made a wide selection of standardized treebanks for different languages available to the research community and allowed for easy comparison between various statistical methods on a standardized benchmark. One of the surprising findings of this evaluation is that the best results are achieved by methods which are quite different from state-of-the-art models for constituent parsing, e.g. the deterministic parsing method of Nivre et al. (2006) and the minimum spanning tree parser of McDonald et al. (2006).


Notes

  1.

    In preliminary experiments, we also considered look-ahead, where the word is predicted earlier than it appears at the head of the queue I, and “anti-look-ahead”, where the word is predicted only when it is shifted to the stack S. Early prediction allows conditioning decision probabilities on the words in the look-ahead and, thus, speeds up the search for an optimal decision sequence. However, the loss of accuracy with look-ahead was quite significant. The described method, where a new word is predicted when it appears at the head of the queue, led to the most accurate model and quite efficient search. The anti-look-ahead model was both less accurate and slower.

  2.

    We refer to the head of the queue as the front, to avoid unnecessary ambiguity of the word head in the context of dependency parsing.

  3.

    The tuned feature sets were obtained from http://w3.msi.vxu.se/~nivre/research/MaltParser.html. We removed look-ahead features for the ISBN experiments but preserved them for the experiments with the MALT parser. Analogously, we extended the simple feature set with three words of look-ahead for the MALT parser experiments.

  4.

    Part-of-speech tags for multi-word units in the Dutch treebank were formed as the concatenation of the tags of the component words, which led to quite a sparse set of part-of-speech tags.

  5.

    Note that the development set accuracy correctly predicted the test set ranking of the ISBN TF, LF and TF-NA models on each of the datasets, so it is fair to compare the best ISBN result among the three with other parsers.

  6.

    The MALT parser is trained to keep the word as long as possible: if both Shift and Reduce decisions are possible during training, it always prefers to shift. Though this strategy should generally reduce the described problem, it is evident from the low precision score for attachment to root that it cannot completely eliminate it.
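    The tie-breaking rule described in this note can be sketched as follows. This is an illustrative reconstruction, not MaltParser's actual code: the action names, the `choose_action` helper, and the priority order among the remaining actions are all assumptions.

    ```python
    # Sketch of the training-time tie-breaking rule described above: when
    # the oracle permits both Shift and Reduce, Shift is chosen, keeping
    # the word on the stack as long as possible. Action names and the
    # fall-back priority order are assumptions, not MaltParser's API.

    def choose_action(permissible):
        """Pick one action from the oracle-permissible set, preferring
        Shift over Reduce whenever both are allowed."""
        if "SHIFT" in permissible and "REDUCE" in permissible:
            return "SHIFT"
        # Otherwise fall back to a fixed (assumed) priority order.
        for action in ("LEFT-ARC", "RIGHT-ARC", "SHIFT", "REDUCE"):
            if action in permissible:
                return action
        raise ValueError("no permissible action")
    ```

    Under this rule a word is reduced from the stack only when shifting is no longer an option, which is why attachment-to-root errors can still occur but are made less frequent.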

  7.

    Use of cross-validation with our model is relatively time-consuming and, thus, not quite feasible for the shared task.

  8.

    A piecewise-linear approximation for each individual language was used to compute the average. Experiments were run on a standard 2.4 GHz desktop PC.
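    The averaging the note describes can be sketched as below: each language's accuracy-vs-parsing-time measurements are treated as a piecewise-linear curve, resampled on a shared grid of parsing times, and averaged pointwise. All data values, language names, and the time grid here are invented for illustration.

    ```python
    # Hedged sketch of averaging accuracy-vs-time curves across languages
    # via piecewise-linear interpolation. The measurements are illustrative.
    import numpy as np

    curves = {  # language -> (seconds per word, labeled accuracy %)
        "lang1": ([0.001, 0.01, 0.1], [70.0, 78.0, 80.0]),
        "lang2": ([0.002, 0.02, 0.2], [65.0, 74.0, 77.0]),
    }

    grid = np.logspace(-2.5, -1.0, 5)  # common grid of parsing times (s)
    # np.interp joins each language's points piecewise-linearly and
    # resamples them on the shared grid (clamping outside the data range).
    resampled = [np.interp(grid, times, acc) for times, acc in curves.values()]
    average = np.mean(resampled, axis=0)  # pointwise average curve
    ```

    Plotting `average` against `grid` would give a single speed/accuracy trade-off curve of the kind summarized in the text.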

  9.

    For Basque, Chinese, and Turkish this time is below 7 ms, but for English it is 38 ms. English, along with Catalan, required the largest beam across all ten languages. Note that accuracy in the lowest part of the curve can probably be improved by varying latent vector size and frequency cut-offs. Also, efficiency was not the main goal during the implementation of the parser, and it is likely that a much faster implementation is possible.

  10.

    The ISBN dependency parser is downloadable from http://flake.cs.uiuc.edu/titov/idp/

References

  • Abeillé, A. (Ed.) (2003). Treebanks: Building and Using Parsed Corpora. Dordrecht: Kluwer.


  • Aduriz, I., M. J. Aranzabe, J. M. Arriola, A. Atutxa, A. D. de Ilarraza, A. Garmendia, and M. Oronoz (2003). Construction of a Basque dependency treebank. In Proceedings of the 2nd Workshop on Treebanks and Linguistic Theories (TLT), Växjö, pp. 201–204.


  • Aho, A. V., R. Sethi, and J. D. Ullman (1986). Compilers: Principles, Techniques and Tools. Reading, MA: Addison Wesley.


  • Böhmová, A., J. Hajič, E. Hajičová, and B. Hladká (2003). The PDT: a 3-level annotation scenario. See Abeillé (2003), Chapter 7, pp. 103–127.

  • Bottou, L. (1991). Une approche théorique de l’apprentissage connexionniste: Applications à la reconnaissance de la parole. Ph. D. thesis, Université de Paris XI, Paris.


  • Buchholz, S. and E. Marsi (2006). CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the 10th Conference on Computational Natural Language Learning, New York, NY, pp. 149–164.


  • Charniak, E. (2000). A maximum-entropy-inspired parser. In Proceedings of the 1st Meeting of North American Chapter of Association for Computational Linguistics, Seattle, WA, pp. 132–139.


  • Charniak, E. and M. Johnson (2005). Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of the 43rd Meeting of Association for Computational Linguistics, Ann Arbor, MI, pp. 173–180.


  • Chen, K., C. Luo, M. Chang, F. Chen, C. Chen, C. Huang, and Z. Gao (2003). Sinica treebank: design criteria, representational issues and implementation. See Abeillé (2003), Chapter 13, pp. 231–248.

  • Collins, M. (1999). Head-Driven Statistical Models for Natural Language Parsing. Ph. D. thesis, University of Pennsylvania, Philadelphia, PA.


  • Collins, M. (2000). Discriminative reranking for natural language parsing. In Proceedings of the 17th International Conference on Machine Learning, Stanford, CA, pp. 175–182.


  • Csendes, D., J. Csirik, T. Gyimóthy, and A. Kocsor (2005). The Szeged Treebank. Berlin/Heidelberg: Springer.


  • Dzeroski, S., T. Erjavec, N. Ledinek, P. Pajas, Z. Zabokrtsky, and A. Zele (2006). Towards a Slovene dependency treebank. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Genoa, pp. 1388–1391.


  • Hajič, J., O. Smrž, P. Zemánek, J. Šnaidauf, and E. Beška (2004). Prague Arabic dependency treebank: development in data and tools. In Proceedings of the NEMLAR International Conference on Arabic Language Resources and Tools, Cairo, pp. 110–117.


  • Hall, J., J. Nilsson, J. Nivre, G. Eryigit, B. Megyesi, M. Nilsson, and M. Saers (2007). Single malt or blended? a study in multilingual parser optimization. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, Prague, pp. 933–939.


  • Henderson, J. (2003). Inducing history representations for broad coverage statistical parsing. In Proceedings of the Joint Meeting of North American Chapter of the Association for Computational Linguistics and the Human Language Technology Conference, Edmonton, AB, pp. 103–110.


  • Henderson, J. (2004). Discriminative training of a neural network statistical parser. In Proceedings of the 42nd Meeting of Association for Computational Linguistics, Barcelona, pp. 95–102.


  • Henderson, J., P. Merlo, G. Musillo, and I. Titov (2008). A latent variable model of synchronous parsing for syntactic and semantic dependencies. In Proceedings of the CoNLL-2008 Shared Task, Manchester, pp. 178–182.


  • Henderson, J. and I. Titov (2005). Data-defined kernels for parse reranking derived from probabilistic models. In Proceedings of the 43rd Meeting of Association for Computational Linguistics, Ann Arbor, MI, pp. 181–188.


  • Johansson, R. and P. Nugues (2007). Extended constituent-to-dependency conversion for English. In Proceedings of the 16th Nordic Conference on Computational Linguistics (NODALIDA), Tartu, pp. 105–112.


  • Jordan, M. I., Z. Ghahramani, T. S. Jaakkola, and L. K. Saul (1999). An introduction to variational methods for graphical models. In M. I. Jordan (Ed.), Learning in Graphical Models, pp. 183–233. Cambridge, MA: MIT Press.


  • Koo, T. and M. Collins (2005). Hidden-variable models for discriminative reranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Vancouver, BC, pp. 507–514.


  • Kromann, M. T. (2003). The Danish dependency treebank and the underlying linguistic theory. In Proceedings of the 2nd Workshop on Treebanks and Linguistic Theories (TLT), Växjö.


  • Liang, P., S. Petrov, M. Jordan, and D. Klein (2007). The infinite PCFG using hierarchical Dirichlet processes. In Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, pp. 688–697.


  • Marcus, M., B. Santorini, and M. Marcinkiewicz (1993). Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics 19(2), 313–330.


  • Martí, M. A., M. Taulé, L. Màrquez, and M. Bertran (2007). CESS-ECE: a multilingual and multilevel annotated corpus. Available for download from: http://www.lsi.upc.edu/ mbertran/cess-ece/

  • Matsuzaki, T., Y. Miyao, and J. Tsujii (2005). Probabilistic CFG with latent annotations. In Proceedings of the 43rd Annual Meeting of the ACL, Ann Arbor, MI, pp. 75–82.


  • McDonald, R., K. Lerman, and F. Pereira (2006). Multilingual dependency analysis with a two-stage discriminative parser. In Proceedings of the 10th Conference on Computational Natural Language Learning, New York, NY, pp. 216–220.


  • Montemagni, S., F. Barsotti, M. Battista, N. Calzolari, O. Corazzari, A. Lenci, A. Zampolli, F. Fanciulli, M. Massetani, R. Raffaelli, R. Basili, M. T. Pazienza, D. Saracino, F. Zanzotto, N. Nana, F. Pianesi, and R. Delmonte (2003). Building the Italian Syntactic-Semantic Treebank. See Abeillé (2003), Chapter 11, pp. 189–210.

  • Murphy, K. P. (2002). Dynamic Bayesian Networks: Representation, Inference and Learning. Ph. D. thesis, University of California, Berkeley, CA.


  • Musillo, G. and P. Merlo (2008). Unlexicalised hidden variable models of split dependency grammars. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Columbus, OH, pp. 213–216.


  • Nakagawa, T. (2007). Multilingual dependency parsing using global features. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, pp. 952–956.


  • Neal, R. (1992). Connectionist learning of belief networks. Artificial Intelligence 56, 71–113.


  • Nivre, J., J. Hall, and J. Nilsson (2004). Memory-based dependency parsing. In Proceedings of the 8th Conference on Computational Natural Language Learning, Boston, MA, pp. 49–56.


  • Nivre, J., J. Hall, J. Nilsson, G. Eryigit, and S. Marinov (2006). Pseudo-projective dependency parsing with support vector machines. In Proceedings of the 10th Conference on Computational Natural Language Learning, New York, NY, pp. 221–225.


  • Oflazer, K., B. Say, D. Z. Hakkani-Tür, and G. Tür (2003). Building a Turkish treebank. See Abeillé (2003), Chapter 15, pp. 261–277.

  • Peshkin, L. and V. Savova (2005). Dependency parsing with dynamic Bayesian network. In AAAI, 20th National Conference on Artificial Intelligence, Pittsburgh, PA, pp. 1112–1117.


  • Petrov, S., L. Barrett, R. Thibaux, and D. Klein (2006). Learning accurate, compact, and interpretable tree annotation. In Proceedings of the Annual Meeting of the ACL and the International Conference on Computational Linguistics, Sydney, pp. 433–44.


  • Petrov, S. and D. Klein (2007). Improved inference for unlexicalized parsing. In Proceedings of the Conference on Human Language Technology and North American chapter of the Association for Computational Linguistics (HLT-NAACL 2007), Rochester, NY, pp. 404–411.


  • Prescher, D. (2005). Head-driven PCFGs with latent-head statistics. In Proceedings of the 9th International Workshop on Parsing Technologies, Vancouver, BC, pp. 115–124.


  • Prokopidis, P., E. Desypri, M. Koutsombogera, H. Papageorgiou, and S. Piperidis (2005). Theoretical and practical issues in the construction of a Greek dependency treebank. In Proceedings of the 4th Workshop on Treebanks and Linguistic Theories (TLT), Barcelona, pp. 149–160.


  • Riezler, S., T. H. King, R. M. Kaplan, R. Crouch, J. T. Maxwell, and M. Johnson (2002). Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques. In Proceedings of the 40th Meeting of Association for Computational Linguistics, Philadelphia, PA, pp. 271–278.


  • Sallans, B. (2002). Reinforcement Learning for Factored Markov Decision Processes. Ph. D. thesis, University of Toronto, Toronto, ON.


  • Sha, F. and F. Pereira (2003). Shallow parsing with conditional random fields. In Proceedings of the Joint Meeting of North American Chapter of the Association for Computational Linguistics and the Human Language Technology Conference, Edmonton, AB, pp. 213–220.


  • Titov, I. and J. Henderson (2007). Constituent parsing with Incremental Sigmoid Belief Networks. In Proceedings of the 45th Meeting of Association for Computational Linguistics, Prague, pp. 632–639.


  • van der Beek, L., G. Bouma, J. Daciuk, T. Gaustad, R. Malouf, G. van Noord, R. Prins, and B. Villada (2002). The Alpino dependency treebank. In Computational Linguistics in the Netherlands (CLIN), Enschede, pp. 8–22.



Acknowledgements

This work was funded by Swiss NSF grant 200020-109685, Swiss NSF Fellowship PBGE22-119276, UK EPSRC grant EP/E019501/1, EU FP6 grant 507802 (TALK project), and EU FP7 grant 216594 (CLASSiC project).

Author information

Corresponding author

Correspondence to Ivan Titov.


Copyright information

© 2010 Springer Science+Business Media B.V.

Cite this chapter

Titov, I., Henderson, J. (2010). A Latent Variable Model for Generative Dependency Parsing. In: Bunt, H., Merlo, P., Nivre, J. (eds) Trends in Parsing Technology. Text, Speech and Language Technology, vol 43. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9352-3_3
