Morphological Disambiguation for Turkish

  • Dilek Zeynep Hakkani-Tür
  • Murat Saraçlar
  • Gökhan Tür
  • Kemal Oflazer
  • Deniz Yuret
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

Morphological disambiguation is the task of determining the contextually correct morphological parse of each token in a sentence. A morphological disambiguator takes as input the set of candidate parses that a morphological analyzer generates for each token, and selects one parse per token using statistical and/or linguistic contextual information. For morphologically rich languages, this task can be seen as a generalization of the part-of-speech (POS) tagging problem. The disambiguated morphological analysis is usually crucial for further processing steps such as dependency parsing. In this chapter, we review the morphological disambiguation problem for Turkish and discuss how approaches to solving it have evolved from manually crafted constraint-based rule systems to systems employing machine learning.
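
The input/output contract described in the abstract can be illustrated with a minimal sketch. It is not the method of any system reviewed in the chapter: the function names (`final_tag`, `disambiguate`, `toy_score`), the weight table, and the pairwise scoring scheme are all hypothetical, and the parse strings only imitate the general style of Turkish analyzer output. The sketch assumes every token has at least one candidate parse and uses a Viterbi-style search to pick one parse per token.

```python
from typing import Callable, Dict, List, Tuple

Parse = str  # e.g. "masal+Noun+A3sg+Pnon+Acc" (illustrative notation only)


def final_tag(parse: Parse) -> str:
    """Return the last '+'-separated feature of a parse, used as a crude context key."""
    return parse.rsplit("+", 1)[-1]


def disambiguate(
    candidates: List[List[Parse]],
    score: Callable[[Parse, Parse], float],
) -> List[Parse]:
    """Select one parse per token, maximizing the sum of pairwise context scores.

    Assumes each inner list is non-empty (the analyzer returned at least one parse).
    """
    if not candidates:
        return []
    # best[p] = (score of the best path ending in parse p, that path)
    best: Dict[Parse, Tuple[float, List[Parse]]] = {
        p: (score("<s>", p), [p]) for p in candidates[0]
    }
    for options in candidates[1:]:
        new_best: Dict[Parse, Tuple[float, List[Parse]]] = {}
        for p in options:
            # Pick the predecessor that gives the highest total score for p.
            prev, (s, path) = max(
                best.items(), key=lambda kv: kv[1][0] + score(kv[0], p)
            )
            new_best[p] = (s + score(prev, p), path + [p])
        best = new_best
    return max(best.values(), key=lambda v: v[0])[1]


# Hypothetical hand-set weights standing in for a learned or rule-based model.
TOY_WEIGHTS: Dict[Tuple[str, str], float] = {
    ("Acc", "A3sg"): 2.0,  # accusative object followed by a finite verb
    ("Nom", "A3sg"): 1.0,
}


def toy_score(prev: Parse, cur: Parse) -> float:
    """Score a pair of adjacent parses by their final features (0.0 if unseen)."""
    return TOY_WEIGHTS.get((final_tag(prev), final_tag(cur)), 0.0)


# Candidate parses for a two-token input; the first token is three-way ambiguous.
sentence = [
    [
        "masal+Noun+A3sg+Pnon+Acc",
        "masal+Noun+A3sg+P3sg+Nom",
        "masa+Noun+A3sg+Pnon+Nom^DB+Adj+With",
    ],
    ["oku+Verb+Pos+Past+A3sg"],
]
chosen = disambiguate(sentence, toy_score)
print(chosen)
```

Swapping `toy_score` for a different scoring function changes the model without touching the search, which loosely mirrors the shift the abstract describes from hand-written constraints to learned scorers.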

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Dilek Zeynep Hakkani-Tür (1)
  • Murat Saraçlar (2)
  • Gökhan Tür (1)
  • Kemal Oflazer (3, corresponding author)
  • Deniz Yuret (4)

  1. Google Inc., Mountain View, USA
  2. Boğaziçi University, Istanbul, Turkey
  3. Carnegie Mellon University Qatar, Doha-Education City, Qatar
  4. Koç University, Istanbul, Turkey