Semantic Search Tools Based on Ontological Representations of Documentary Information

  • Information analysis
Automatic Documentation and Mathematical Linguistics

Abstract

This paper considers ontological means of identifying and representing the semantics of text documents as applied to interactive information retrieval. Implementations of ontology operations that make it possible to form images of new meanings in a subject area are presented. These operations rely on taxonomies of relations (paradigmatic and syntagmatic) and of entities (a polythematic thesaurus of concepts); the same taxonomies are used to identify fuzzy relationships, which arise when entities and relationships are specified at different levels of generality and/or expressed by different linguistic constructions. Interactive tools are proposed that construct aspect projections of the graph representations of ontologies, reducing the dimension of the graph to a level that is acceptable for display and perception. The possibilities of using entities and relationships in context to develop the search process are considered. The paper discusses the use of ontology graphs of organizational processes as a “navigation map” that provides new entry points in the graphical interface, i.e., objects that serve as contextually defined patterns of search queries in similarity selection tasks or expert analysis tasks. This not only increases the completeness and accuracy of information retrieval but also makes the scheme of search navigation more natural, bringing it closer to the schemes by which knowledge is understood and synthesized.
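
The idea of an aspect projection can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that an aspect is simply a set of relation types and that conceptual depth is a hop limit from a chosen start concept. The triples, relation names, and the function `aspect_projection` are hypothetical and serve only as an illustration.

```python
from collections import deque

# Hypothetical toy ontology: (source concept, relation type, target concept).
TRIPLES = [
    ("reactor", "is_a", "facility"),
    ("reactor", "has_part", "core"),
    ("core", "has_part", "fuel assembly"),
    ("reactor", "located_in", "power plant"),
    ("fuel assembly", "made_of", "uranium dioxide"),
]

def aspect_projection(triples, aspect, start, max_depth):
    """Keep only edges whose relation type belongs to `aspect` and that are
    reachable from `start` within `max_depth` hops along the edge direction.
    Assumption: an aspect is modeled as a plain set of relation types."""
    adjacency = {}
    for source, relation, target in triples:
        if relation in aspect:
            adjacency.setdefault(source, []).append((relation, target))

    kept = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        vertex, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not expand this vertex
        for relation, target in adjacency.get(vertex, []):
            kept.append((vertex, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return kept

projection = aspect_projection(TRIPLES, {"has_part"}, "reactor", max_depth=2)
print(projection)
```

For the part-whole aspect and a depth of 2, only the two `has_part` edges starting from "reactor" remain, so a five-edge graph is reduced to a two-edge fragment that is easier to display.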

Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.

Notes

  1. The main provisions and literature reviews on ontologies, the methodology for building and using ontologies, and the approaches to typifying relationships are presented in [1], [2], and [3], respectively.

  2. The xIRBIS Information and Analytical System is used by a number of leading information centers and organizations to create industrial document databases.

  3. The complexity of constructing graph representations of the processes of synthesis of new knowledge is also due to the fact that there are no predefined typical solution schemes.

  4. The conditions of user interaction with the system also obey the Miller principle [17], which reflects quantitative restrictions on the user’s ability to perceive and identify information.

  5. It follows from here that the system must have complete and deep knowledge represented by well-formalized objects. Today, these are thesauri, classifications, taxonomies, and domain ontologies.

  6. An ontology is defined as a set of three interconnected systems: O = ❬Sf, Sc, St❭. Sf is the functional system (objects and relationships of reality), defined as Sf = ❬Mf, Af, Rf, Zf❭, where Mf is a set of objects (entities), Af is a set of characteristic properties, Rf is a set of functional relationships represented by typified situational connections that are typical for the subject area (SA), and Zf is the law of composition, i.e., the rules and schemes for ordering the objects (for example, the taxonomy of the SA). Sc is the conceptual system, defined as Sc = ❬Mc, Ac, Rc, Zc❭, where Mc is a set of concepts of the SA, Ac is a set of signs of systematization of the concepts (the meronomy of the SA), Rc is a set of relationships (primarily paradigmatic relationships) that define classes/subclasses, and Zc is the law of composition (a representation scheme that defines which concepts are included in classes, as well as in what relationship and in what order). St is the terminological system, defined as St = ❬Mt, At, Rt, Zt❭, where Mt is a set of terms, At is a set of properties, Rt is a set of equivalence and inclusion relationships as well as linguistic relationships, and Zt is the law of composition (grammar). The operation ≡ compares the elements of different systems at the level of signs, which ensures their identity in the functional, conceptual, and terminological systems.

  7. This definition also reflects the relativity of knowledge. The presence of the law of composition in the definition of the ontology means that a multi-ontology can exist, i.e., more than one ontology (holistic, self-sufficient “conceptualizations” of the SA) defined on the same set of entities and relationships.

  8. Examples of ontology operations are also given in [20].

  9. We consider the graph G(V, E) = ❬V, E❭, where V is the set of vertices of the graph, and E is the set of its edges. The graph G has the property of a multi-graph if: \(\exists G_{i}(V_{i}, E_{i}) \subset G,\ G_{j}(V_{j}, E_{j}) \subset G : \left(\{V_{i}\} = \{V_{j}\}\right) \wedge \left(\{E_{i}\} \ne \{E_{j}\}\right).\) The graph G has the property of a meta-graph if: \(\exists V_{j} \in V : \left(\{V_{j}\} \subset \{V \backslash V_{j}\}\right)\).

  10. First of all, this refers to the functional ontology.

  11. It is also a managed context: through the specification of an aspect and/or a parameter of conceptual depth and/or width.

  12. Aspect ideas are one of the methodological foundations of the synthesis of new knowledge [22]. This model of knowledge synthesis as a self-organizing process is based on the structural feature of the system: a complex system can be described using a set of relatively independent aspect representations. The process of decomposition not only identifies and binds components, but also forms the decomposition scheme, i.e., a system of characteristic features, in accordance with which the decomposition is carried out.

  13. At this point, the ambiguity of representation (and, accordingly, the identification of the vertices) becomes obvious: the vertices must essentially have the same name (name of the concept itself), but the content will be different, since the meaning of the concept is also determined by the context formed by related concepts and type of relationships.

  14. The correspondence of relationships and linguistic constructions is presented in [3].

  15. The practical material was prepared by the graduate student of the National Research Nuclear University “MEPhI” T.G. Ismailov.

  16. The ontology is built automatically; however, Figure 3 shows the manually processed image of the graph, where meta-vertices are distinguished manually (tools for display and operations on meta-graphs as part of the xIRBIS AIS are under development).
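
The multi-graph and meta-graph conditions of note 9 can be checked on small examples with the following Python sketch. It rests on stated assumptions rather than on the authors' tools: the two subgraphs are supplied explicitly (for instance, as layers for different relation types), a meta-vertex is modeled as a vertex with a non-empty set of member vertices, and the inclusion sign is read as non-strict. All names and data are hypothetical.

```python
def same_vertices_different_edges(sub_i, sub_j):
    """Multi-graph condition for two given subgraphs, each passed as a
    (vertices, edges) pair of sets: equal vertex sets, different edge sets."""
    vertices_i, edges_i = sub_i
    vertices_j, edges_j = sub_j
    return vertices_i == vertices_j and edges_i != edges_j

def find_meta_vertices(vertices, members):
    """Meta-graph condition: return the vertices whose member set is
    non-empty and contained in the remaining vertices of the graph.
    Assumption: `members` maps a vertex name to the set of vertices it wraps."""
    found = []
    for vertex in sorted(vertices):
        inner = members.get(vertex, set())
        if inner and inner <= vertices - {vertex}:
            found.append(vertex)
    return found

# Hypothetical example data.
VERTICES = {"document", "author", "topic", "publication record"}
MEMBERS = {"publication record": {"document", "author"}}
authorship_layer = ({"document", "author", "topic"}, {("document", "author")})
subject_layer = ({"document", "author", "topic"}, {("document", "topic")})

is_multi = same_vertices_different_edges(authorship_layer, subject_layer)
metas = find_meta_vertices(VERTICES, MEMBERS)
print(is_multi, metas)
```

Here the two layers share one vertex set but differ in their edges, so the multi-graph condition holds, and "publication record" is reported as a meta-vertex because it wraps two other vertices of the same graph.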

REFERENCES

  1. Golitsyna, O.L., Maksimov, N.V., Okropishina, O.V., and Strogonov, V.I., The ontological approach to the identification of information in tasks of document retrieval, Autom. Doc. Math. Linguist., 2012, vol. 46, no. 3, pp. 125–132.

  2. Maksimov, N.V., The methodological basis of ontological documentary information modeling, Autom. Doc. Math. Linguist., 2018, vol. 52, no. 2, pp. 57–72.

  3. Maksimov, N.V., Gavrilkina, A.S., Andronova, V.V., and Tazieva, I.A., Systematization and identification of semantic relations in ontologies for scientific and technical subject areas, Autom. Doc. Math. Linguist., 2018, vol. 52, no. 6, pp. 306–317.

  4. Hoeber, O., Yang, X., and Yao, Y., Conceptual query expansion, Proc. of the Third Int. Conf. on Advances in Web Intelligence (June 6–9, 2005, Lodz, Poland), 2005, pp. 190–196.

  5. Hoeber, O., Yang, X., and Yao, Y., Visualization support for interactive query refinement, Proc. of the 2005 IEEE/WIC/ACM Int. Conf. on Web Intelligence (September 19–22, 2005, Compiegne, France), Washington, DC, 2005, pp. 19–22.

  6. Gorbun, E.S., Maksimov, N.V., Monankov, K.V., and Nizametdinov, Sh.U., Model and tools for interactive analysis of dynamics and relations of scientific information publications flows, Sci. Visualization, 2015, vol. 7, no. 5, pp. 12–25.

  7. Hoeber, O., Information visualization for interactive information retrieval, Proc. of the 2018 Conference on Human Information Interaction & Retrieval (March 11–15, 2018, New Brunswick, NJ, USA), New York, NY, 2018, pp. 371–374.

  8. Krug, S., Don’t Make Me Think: A Common Sense Approach to Web Usability, Indianapolis: New Riders, 2000.

  9. Cooper, A., Reimann, R., Cronin, D., and Noessel, C., About Face: The Essentials of Interaction Design, Indianapolis: John Wiley & Sons, Inc, 2014, 4th ed.

  10. Rose, D. and Levinson, D., Understanding user goals in web search, Proc. of the 13th Int. Conf. on World Wide Web (May 17–22, 2004, New York City, USA), New York, NY, 2004, pp. 13–19.

  11. Teevan, J., Dumais, S., and Horvitz, E., Personalizing search via automated analysis of interests and activities, Proc. of SIGIR (August 15–19, 2005, Salvador, Brazil), New York, NY, 2005, pp. 449–456.

  12. Hoeber, O. and Yang, X., Interactive web information retrieval using WordBars, Proc. of the 2006 IEEE/WIC/ACM Int. Conf. on Web Intelligence (December 18–22, 2006, Hong Kong, China), Washington, DC, 2006, pp. 875–882.

  13. IBM i2 Analyst’s Notebook. https://www.ibm.com/ru-ru/marketplace/analysts-notebook. Accessed April 4, 2019.

  14. Products built for a purpose. https://www.palantir.com. Accessed April 4, 2019.

  15. RCO Fact Extractor SDK. http://www.rco.ru/?page_id=3554. Accessed April 4, 2019.

  16. Compreno. https://www.abbyy.com/ru-ru/science/technologies/compreno/. Accessed April 4, 2019.

  17. Miller, G.A., The magical number seven, plus or minus two: Some limits on our capacity for processing information, Psychol. Rev., 1956, vol. 63, pp. 81–97.

  18. Maksimov, N.V., Information and knowledge: Nature and the conceptual model, Autom. Doc. Math. Linguist., 2010, vol. 44, no. 4, pp. 177–186.

  19. Golitsyna, O.L., Maksimov, N.V., Okropishina, O.V., and Okropishin, A.E., Semantic identification of text in the class of tasks of information retrieval, Proc. of ICAI'14, WORLDCOMP'14 (July 21–24, 2014, Las Vegas, Nevada, USA), 2014, vol. 1, pp. 47–52.

  20. Golitsyna, O.L., Maksimov, N.V., Okropishina, O.V., and Strogonov, V.I., An ontological approach to information identification in tasks of document retrieval: A practical application, Autom. Doc. Math. Linguist., 2013, vol. 47, no. 2, pp. 45–51.

  21. Maksimov, N.V., Golitsyna, O.L., Ganchenkova, M.G., Sanatov, D.V., and Razumov, A.V., The semantic core of the digital platform, Ontol. Proekt., 2018, vol. 8, no. 3, pp. 412–426.

  22. Urmantsev, Yu.A., General system theory: State, applications, and development prospects, in Sistema, Simmetriya, Garmoniya (System, Symmetry, and Harmony), Moscow: Mysl’, 1988, pp. 38–124.

  23. Skobelev, P.O., Ontologies of activity for situational management of enterprises in real time, Ontol. Proekt., 2012, no. 1, pp. 6–38.

Author information

Correspondence to N. V. Maksimov, O. L. Golitsina, K. V. Monankov, A. A. Lebedev, N. A. Bal or S. G. Kyurcheva.

Ethics declarations

The authors declare that they have no conflict of interest.

Additional information

Translated by L. A. Solovyova

About this article

Cite this article

Maksimov, N.V., Golitsina, O.L., Monankov, K.V. et al. Semantic Search Tools Based on Ontological Representations of Documentary Information. Autom. Doc. Math. Linguist. 53, 167–178 (2019). https://doi.org/10.3103/S0005105519040046

  • DOI: https://doi.org/10.3103/S0005105519040046