Abstract

Universal Dependencies is a recent initiative to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. In this paper, I outline the motivation behind the initiative and explain how the basic design principles follow from these requirements. I then discuss the different components of the annotation standard, including principles for word segmentation, morphological annotation, and syntactic annotation. I conclude with some thoughts on the challenges that lie ahead.

Keywords

Natural Language Processing; Content Word; Function Word; Computational Linguistics; Word Segmentation
These keywords were added by machine and not by the authors.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Linguistics and Philology, Uppsala University, Uppsala, Sweden
