Combining Diverse Word-Alignment Symmetrizations Improves Dependency Tree Projection

  • David Mareček
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6608)

Abstract

For many languages, no supervised parser can be trained because no manually annotated data are available. This problem can be addressed by taking a parallel corpus with English, parsing the English side, projecting the dependencies through word-alignment connections, and training a parser on the projected trees. In this paper, we introduce a simple algorithm that combines several word-alignment symmetrizations. We show that our method outperforms previous work, even though it uses McDonald’s maximum-spanning-tree parser as is, without any “unsupervised” modifications.
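The projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes a simplified setting: alignments are sets of (English index, target index) pairs, the only symmetrization shown is plain intersection of the two alignment directions, and only one-to-one links are used. The names `intersect` and `project_heads` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# A link is (english_index, target_index), both 0-based.
Alignment = Set[Tuple[int, int]]


def intersect(e2f: Alignment, f2e: Alignment) -> Alignment:
    """Intersection symmetrization: keep links found in both directions."""
    return e2f & f2e


def project_heads(
    en_heads: List[int],   # head index of each English token, -1 for the root
    alignment: Alignment,
    target_len: int,
) -> List[Optional[int]]:
    """Project English heads onto target tokens through one-to-one links.

    Returns one entry per target token: a target head index, -1 for the
    root, or None when no edge could be projected for that token.
    """
    en_to_tgt: Dict[int, int] = {}
    tgt_to_en: Dict[int, int] = {}
    for e, f in sorted(alignment):
        # Assumption: keep only the first link per token (one-to-one mapping).
        if e not in en_to_tgt and f not in tgt_to_en:
            en_to_tgt[e] = f
            tgt_to_en[f] = e

    projected: List[Optional[int]] = [None] * target_len
    for f, e in tgt_to_en.items():
        head = en_heads[e]
        if head == -1:
            projected[f] = -1                 # the root stays the root
        elif head in en_to_tgt:
            projected[f] = en_to_tgt[head]    # copy the edge across the links
    return projected


# Toy example: English "the dog barks", target sentence with two tokens.
en_heads = [1, 2, -1]                  # the -> dog, dog -> barks, barks = root
e2f = {(0, 0), (1, 0), (2, 1)}         # English-to-target direction
f2e = {(1, 0), (2, 1)}                 # target-to-English direction
links = intersect(e2f, f2e)
tree = project_heads(en_heads, links, target_len=2)
```

Target tokens left as `None` received no projected head. How such gaps are filled, and how several symmetrizations are combined, is where the paper's own method comes in.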


References

  1. Klein, D., Manning, C.D.: Corpus-based induction of syntactic structure: Models of dependency and constituency. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL 2004. Association for Computational Linguistics, Morristown (2004)
  2. Koo, T., Carreras, X., Collins, M.: Simple semi-supervised dependency parsing. In: Proceedings of ACL/HLT (2008)
  3. Hwa, R., Resnik, P., Weinberg, A., Kolak, O.: Evaluating Translational Correspondence using Annotation Projection. In: Proceedings of the 40th Annual Meeting of the ACL, pp. 392–399 (2002)
  4. Hwa, R., Resnik, P., Weinberg, A., Cabezas, C., Kolak, O.: Bootstrapping Parsers via Syntactic Projection across Parallel Texts. Natural Language Engineering 11(3), 311–325 (2005)
  5. Smith, D.A., Eisner, J.: Parser adaptation and projection with quasi-synchronous grammar features. In: EMNLP 2009: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 822–831. Association for Computational Linguistics, Morristown (2009)
  6. Ganchev, K., Gillenwater, J., Taskar, B.: Dependency grammar induction via bitext projection constraints. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2009, pp. 369–377. Association for Computational Linguistics, Morristown (2009)
  7. Jiang, W., Liu, Q.: Dependency parsing and projection based on word-pair classification. In: ACL 2010: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 12–20. Association for Computational Linguistics, Morristown (2010)
  8. Och, F.J., Ney, H.: A Systematic Comparison of Various Statistical Alignment Models. Computational Linguistics 29(1), 19–51 (2003)
  9. Spoustová, D., Hajič, J., Votrubec, J., Krbec, P., Květoň, P.: The Best of Two Worlds: Cooperation of Statistical and Rule-Based Taggers for Czech. In: ACL 2007: Proceedings of the Workshop on Balto-Slavonic Natural Language Processing, pp. 67–74. Association for Computational Linguistics, Morristown (2007)
  10. McDonald, R., Pereira, F., Ribarov, K., Hajič, J.: Non-Projective Dependency Parsing using Spanning Tree Algorithms. In: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), Vancouver, BC, Canada, pp. 523–530 (2005)
  11. Schmid, H.: Probabilistic Part-of-Speech Tagging Using Decision Trees. In: Proceedings of the International Conference on New Methods in Language Processing, Manchester, UK, vol. 12, pp. 44–49 (1994)
  12. Buchholz, S., Marsi, E.: CoNLL-X shared task on multilingual dependency parsing. In: Proceedings of The Tenth Conference on Natural Language Learning (CoNLL-X), New York City, USA, pp. 149–164 (2006)
  13. Tiedemann, J.: Building a Multilingual Parallel Subtitle Corpus. In: Proceedings of CLIN (2007)
  14. Yamada, H., Matsumoto, Y.: Statistical Dependency Analysis with Support Vector Machines. In: Proceedings of IWPT, pp. 195–206 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • David Mareček
    • Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic
