The Incremental Use of Morphological Information and Lexicalization in Data-Driven Dependency Parsing

  • Gülşen Eryiğit
  • Joakim Nivre
  • Kemal Oflazer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4285)

Abstract

Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Gülşen Eryiğit (1)
  • Joakim Nivre (2)
  • Kemal Oflazer (3)
  1. Department of Computer Engineering, Istanbul Technical Univ., Turkey
  2. School of Mathematics and Systems Engineering, Växjö Univ., Sweden
  3. Faculty of Engineering and Natural Sciences, Sabancı Univ., Turkey
