Problems of Semantics of Words of the Kazakh Language in the Information Retrieval

  • Conference paper
Computational Collective Intelligence (ICCCI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11684)

Abstract

This research concerns intelligent search engines that can search for and extract new information from text data in the Kazakh language and in education. The topic is significant because of the growing amount of data represented in digital form, which provides access to various sources of electronic documents. Intelligent search engines help meet the information needs of users. In this regard, the development of information-analytical search engines that work with data in the Kazakh language is relevant. The goal of this research is to develop efficient algorithms and models for intelligent search systems, based on modern technologies in the field of information retrieval and natural language processing, and to train them.

Acknowledgments

This research was performed and financed under grant project IRN AP05132950 “Development of an information-analytical search system of data in the Kazakh language”, awarded to the Republican State Enterprise (RGP) on the right of economic management (PVC) «Institute of Information and Computational Technologies».

Author information

Corresponding authors

Correspondence to Rakhimova Diana or Shormakova Assem.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Diana, R., Assem, S. (2019). Problems of Semantics of Words of the Kazakh Language in the Information Retrieval. In: Nguyen, N., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2019. Lecture Notes in Computer Science, vol 11684. Springer, Cham. https://doi.org/10.1007/978-3-030-28374-2_7

  • DOI: https://doi.org/10.1007/978-3-030-28374-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28373-5

  • Online ISBN: 978-3-030-28374-2

  • eBook Packages: Computer Science, Computer Science (R0)
