Abstract
The article is devoted to the scientific school developed by the first author in 1995–2012 at Yaroslav-the-Wise Novgorod State University (Veliky Novgorod, Russia). The ultimate practical goal of the school's research can be described as identifying the most rational variant of conveying the sense of a knowledge unit defined by a set of semantically equivalent natural-language phrases. Here, one phrase corresponds to a simple extended natural-language sentence (in the terminology of the “Meaning–Text” theory). The knowledge thus formed about synonymy and about the linguistic forms that express relationships between concepts of a given topical area is in demand in tasks that require establishing full or partial equivalence in meaning, both for complete natural-language sentences and their combinations and for individual fragments of phrases. The results are both theoretical and practical in nature. The proposed methods and their software implementations can be used to solve a wide range of problems in the recognition and semantic analysis of complex information objects (primarily texts and images), as well as for lossless-in-sense information compression.
Funding
The work was carried out with partial support from the Russian Foundation for Basic Research (project no. 19-01-00006-a).
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Additional information
Gennady Martinovich Emelyanov. Born 1943. Graduated from the Ul’yanov (Lenin) Leningrad Institute of Electrical Engineering in 1966. Obtained his Cand. Sci. and Dr. Sci. degrees in 1971 and 1990, respectively. From 1993 to 2003, he was Dean of the Faculty of Mathematics and Computer Science at Yaroslav-the-Wise Novgorod State University. He is now a Professor of the Department of Information Technologies and Systems at the same university. Scientific interests: construction of problem-oriented computing systems for image processing and analysis. He is the author of 103 publications in the field of pattern recognition and image analysis.
Dmitry Vladimirovich Mikhaylov. Born 1974. Graduated from Yaroslav-the-Wise Novgorod State University, Novgorod, in 1997. Obtained his Cand. Sci. and Dr. Sci. degrees in Physics and Mathematics in 2003 and 2013, respectively. From 2000 to 2007, he worked at the Department of Computer Software of Novgorod State University. He is now a Professor of the Department of Information Technologies and Systems at the same university. Since 2002, he has been a member of the Russian Association for Pattern Recognition and Image Analysis. Scientific interests: computational linguistics and artificial intelligence. He has authored 48 papers in the field of pattern recognition and image analysis.
Publisher’s Note.
Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Emelyanov, G.M., Mikhaylov, D.V. Theoretical Foundations, Methods, and Algorithms for Lossless-in-Sense Text Compression. Pattern Recognit. Image Anal. 33, 1657–1663 (2023). https://doi.org/10.1134/S1054661823040144
DOI: https://doi.org/10.1134/S1054661823040144