The Turkish Treebank

  • Gülşen Eryiğit
  • Kemal Oflazer
  • Umut Sulubacak
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

In the last three decades, treebanks have become a crucial resource for building and evaluating natural language processing tools and applications. In this chapter, we review the essential aspects of the first treebank for Turkish, which was built in the early 2000s, and its evolution and extensions since then.

Notes

Acknowledgements

The development of the Turkish Treebank and its extensions was supported by grants 199E026 and 112E276 from TÜBİTAK (Turkish Scientific and Technological Research Council) and by the ICT COST Action IC1207 PARSEME (PARSing and Multi-word Expressions).

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Gülşen Eryiğit (1)
  • Kemal Oflazer (2, corresponding author)
  • Umut Sulubacak (1)

  1. Istanbul Technical University, Istanbul, Turkey
  2. Carnegie Mellon University Qatar, Doha-Education City, Qatar