Abstract
This article presents a comparative experimental study of methods for finding significant keywords in Ukrainian-language content. The approach to automatic keyword identification is based on Porter stemming of Ukrainian words together with the Levenshtein distance, and it allows for the use of a thematic dictionary and the removal of stop words. In an experiment on 100 scientific publications in technical fields, the automatically identified keywords were compared with the keywords chosen by the authors, which yielded a set of statistical characteristics of the accuracy of the search results.
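The pipeline named in the abstract (stop-word removal, stemming, and grouping of stems by Levenshtein distance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the stop-word set, the suffix list, the `toy_stem` helper (a crude suffix stripper, not Porter's algorithm for Ukrainian), and the `max_distance` threshold are hypothetical and are not taken from the paper. The thematic dictionary is not modelled.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) by dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical, deliberately tiny resources; a real system would need
# a full stop-word list and a proper Ukrainian stemmer.
STOP_WORDS = {"і", "та", "в", "на", "для", "що"}
SUFFIXES = ("ами", "ого", "ими", "ів", "ий", "их", "ах", "а", "и", "у", "і", "я")


def toy_stem(word: str) -> str:
    """Strip the longest matching suffix. A stand-in, NOT Porter's algorithm."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keyword_candidates(text: str, top_n: int = 3, max_distance: int = 1):
    """Return the top_n most frequent stems, merging near-identical stems."""
    tokens = [t.strip(".,;:!?()\"'«»").lower() for t in text.split()]
    stems = [toy_stem(t) for t in tokens if t and t not in STOP_WORDS]
    counts = Counter()
    for stem in stems:
        # Reuse an already seen stem if it lies within max_distance edits.
        match = next(
            (k for k in counts if levenshtein(k, stem) <= max_distance), stem
        )
        counts[match] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = "Нейронна мережа та нейронні мережі для мережами"
    print(keyword_candidates(sample, top_n=2))
```

A fixed edit-distance threshold can merge unrelated short stems, so a threshold relative to stem length may be a safer choice in practice.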
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bisikalo, O., Vysotska, V., Lytvyn, V., Brodyak, O., Vyshemyrska, S., Rozov, Y. (2021). Experimental Investigation of Significant Keywords Search in Ukrainian Content. In: Shakhovska, N., Medykovskyy, M.O. (eds) Advances in Intelligent Systems and Computing V. CSIT 2020. Advances in Intelligent Systems and Computing, vol 1293. Springer, Cham. https://doi.org/10.1007/978-3-030-63270-0_1
DOI: https://doi.org/10.1007/978-3-030-63270-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63269-4
Online ISBN: 978-3-030-63270-0
eBook Packages: Intelligent Technologies and Robotics (R0)