Electronic corpora of the Albanian, Kalmyk, Lezgian, and Ossetic Languages

Abstract

This paper presents four electronic corpora created in 2011 within the “Corpus Linguistics: the Albanian, Kalmyk, Lezgian, and Ossetic Languages” Program of Fundamental Research of the RAS. It describes the interface and functionality of the corpora, explains the engineering problems that had to be solved in building them, and discusses the prospects for their further development. Particular emphasis is placed on the compilation of dictionaries and the automatic grammatical markup of the corpora.

Author information

Corresponding author

Correspondence to T. A. Arkhangelskiy.

Additional information

Original Russian Text © T.A. Arkhangelskiy, 2012, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2, 2012, No. 4, pp. 24–29.

About this article

Cite this article

Arkhangelskiy, T.A. Electronic corpora of the Albanian, Kalmyk, Lezgian, and Ossetic Languages. Autom. Doc. Math. Linguist. 46, 118–123 (2012). https://doi.org/10.3103/S0005105512020070

  • DOI: https://doi.org/10.3103/S0005105512020070

Keywords

  • corpus
  • corpus linguistics
  • morphological parser
  • the Albanian language
  • the Kalmyk language
  • the Lezgian language
  • the Ossetic language