Morphological Analysis System of the Tatar Language

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10449)

Abstract

This paper describes a morphological analysis system for the Tatar language based on the two-level morphology model. The system is used for grammatical annotation of the Tatar National Corpus. The paper evaluates the completeness of the system using statistics obtained from the corpus data and describes ways to improve the system.

Keywords

Morphological analysis system, HFST, Alphabet, Phonological rules, The Tatar language, Tatar National Corpus

Notes

Acknowledgements

The reported study was funded by the Russian Science Foundation, research project № 16-18-02074.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Institute of Applied Semiotics, Tatarstan Academy of Sciences, Kazan, Russia
