Abstract
This research concerns intelligent search engines that can search for and extract new information from text data in the Kazakh language, and the training of such systems. The topic is significant because of the growing amount of data in digital form, which provides access to a wide range of electronic document sources. Intelligent search engines make it possible to meet users' information needs. The development of information-analytical search engines that can work with data in the Kazakh language is therefore relevant. The goal of this research is to develop and train efficient algorithms and models for intelligent search systems, based on modern technologies in information retrieval and natural language processing.
Acknowledgments
This research was performed and financed under grant project IRN AP05132950, “Development of an information-analytical search system of data in the Kazakh language”, awarded to the Republican State Enterprise (RGP) on the right of economic management (PVC) «Institute of Information and Computational Technologies».
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Diana, R., Assem, S. (2019). Problems of Semantics of Words of the Kazakh Language in the Information Retrieval. In: Nguyen, N., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2019. Lecture Notes in Computer Science, vol 11684. Springer, Cham. https://doi.org/10.1007/978-3-030-28374-2_7
DOI: https://doi.org/10.1007/978-3-030-28374-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28373-5
Online ISBN: 978-3-030-28374-2
eBook Packages: Computer Science, Computer Science (R0)