Multi-level Ontological Model of Big Data Processing
The paper presents a possible solution to the problems of structuring large volumes of data and storing them in integrated structures that ensure integrity and consistency of representation as well as fast and flexible processing of unstructured information. To solve these problems, the authors propose a method for developing a multi-level ontological structure that addresses the interrelated problems of identifying, structuring and processing big data sets that have primarily natural-language forms of representation. The multi-level model is developed using methods of semantic analysis and relative modeling. The model is suitable for the interpretation and effective integrated processing of unstructured data obtained from distributed sources of information. The multi-level representation of big data determines the methods and mechanisms for a unified meta-description of data elements at the logical level; for the search for patterns and the classification of the feature space at the semantic level; and for the procedures of identifying, consolidating and enriching data at the linguistic level. The modification of this method consists in applying a scalable and computationally efficient genetic algorithm that searches for and generates weight coefficients corresponding to different similarity measures for the set of observed features used in the data-clustering model.
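The idea of a genetic search for weight coefficients over several similarity measures can be illustrated with a minimal sketch. It is not the authors' algorithm: the fitness function (mean combined similarity of same-cluster pairs minus that of different-cluster pairs), the toy data, and all names such as `evolve_weights` are assumptions made for illustration only.

```python
import random

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]

def combined_similarity(weights, sims):
    """Weighted sum of the individual similarity measures for one pair."""
    return sum(w * s for w, s in zip(weights, sims))

def fitness(weights, pairs):
    """Assumed objective: separation between same-cluster and
    different-cluster pairs. Both kinds of pair must be present."""
    same = [combined_similarity(weights, s) for s, is_same in pairs if is_same]
    diff = [combined_similarity(weights, s) for s, is_same in pairs if not is_same]
    return sum(same) / len(same) - sum(diff) / len(diff)

def evolve_weights(pairs, n_measures, pop_size=30, generations=60,
                   mutation_rate=0.2, seed=0):
    """Simple elitist genetic algorithm over weight vectors."""
    rng = random.Random(seed)
    population = [normalize([rng.random() for _ in range(n_measures)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half of the population unchanged.
        population.sort(key=lambda w: fitness(w, pairs), reverse=True)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Mutation: perturb one gene, keeping it non-negative.
            if rng.random() < mutation_rate:
                i = rng.randrange(n_measures)
                child[i] = max(0.0, child[i] + rng.gauss(0, 0.1))
            children.append(normalize(child))
        population = elite + children
    return max(population, key=lambda w: fitness(w, pairs))

# Hypothetical toy data: three similarity measures per pair of objects,
# plus a flag saying whether the pair belongs to the same cluster.
pairs = [
    ([0.9, 0.5, 0.2], True),
    ([0.8, 0.4, 0.3], True),
    ([0.2, 0.5, 0.8], False),
    ([0.1, 0.6, 0.7], False),
]
best = evolve_weights(pairs, n_measures=3)
```

In this toy setting only the first measure separates the two kinds of pairs, so the search is expected to concentrate weight on it. A realistic fitness function would instead score the quality of the resulting clustering.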
Keywords: Semantic similarity, Ontology, Unstructured information, Big data, Semantic analysis, Semantic meta-model, Genetic algorithms
The study was supported by grants from the RFBR (projects № 18-07-00055 and № 17-07-00446) at the Southern Federal University.