Ontological Extraction of Content for Text Querying

  • Troels Andreasen
  • Per Anker Jensen
  • Jørgen Fischer Nilsson
  • Patrizia Paggio
  • Bolette Sandford Pedersen
  • Hanne Erdman Thomsen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2553)


This paper describes a method and a system, ONTOQUERY, for content-based querying of texts. The approach relies on the availability of an ontology for the concepts in the text domain. A key principle of the system is that the conceptual content of noun phrases is extracted into descriptors, which form an integral part of the ontology.

The retrieval of text passages rests on matching descriptors from the text against descriptors from the noun phrases in the query. The match need not be exact: it is mediated by the ontology, invoking in particular taxonomic reasoning over sub- and superconcepts. The paper also reports on a prototype implementation of the system.
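
The ontology-mediated matching described above can be illustrated with a minimal sketch. This is not the ONTOQUERY implementation: the toy taxonomy, the concept names, and the functions `ancestors`, `subsumes`, `matches` and `retrieve` are all hypothetical, and descriptors are simplified to sets of atomic concept names. The sketch covers only one direction of taxonomic reasoning, where a query concept matches a text concept that is equal to it or is one of its subconcepts.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy (child concept -> parent concept); not taken from the paper.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "disease": "entity",
    "substance": "entity",
    "vitamin": "substance",
    "vitamin_c": "vitamin",
    "deficiency_disease": "disease",
    "scurvy": "deficiency_disease",
}


def ancestors(concept: str) -> Set[str]:
    """Return the concept itself together with all of its superconcepts."""
    seen: Set[str] = set()
    current: Optional[str] = concept
    while current is not None and current not in seen:
        seen.add(current)
        current = PARENT.get(current)
    return seen


def subsumes(general: str, specific: str) -> bool:
    """True if `general` equals `specific` or is one of its superconcepts."""
    return general in ancestors(specific)


def matches(query: Set[str], text: Set[str]) -> bool:
    """Assumed matching rule: every query concept subsumes some text concept."""
    return all(any(subsumes(q, t) for t in text) for q in query)


def retrieve(query: Set[str], passages: Dict[str, Set[str]]) -> List[str]:
    """Return the ids of passages whose descriptors match the query descriptors."""
    return [pid for pid, descriptors in passages.items() if matches(query, descriptors)]


# Hypothetical passages, each reduced to a set of concept names.
PASSAGES: Dict[str, Set[str]] = {
    "p1": {"scurvy", "vitamin_c"},
    "p2": {"vitamin"},
    "p3": {"disease"},
}

if __name__ == "__main__":
    # The query concepts are more general than the concepts found in p1.
    print(retrieve({"disease", "vitamin"}, PASSAGES))
```

In this sketch a query for "disease" and "vitamin" retrieves passage `p1`, although neither word-level concept occurs there literally, because "scurvy" and "vitamin_c" are subconcepts of the query concepts. Matching in the other direction (a query concept that is more specific than the text concept) would need an additional relaxation rule, which is left out here.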


Noun Phrase; Semantic Relation; Lexical Entry; Query Evaluation; Prepositional Phrase
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Troels Andreasen (1)
  • Per Anker Jensen (2)
  • Jørgen Fischer Nilsson (3)
  • Patrizia Paggio (4)
  • Bolette Sandford Pedersen (4)
  • Hanne Erdman Thomsen (5)

  1. Computer Science, Roskilde University, Roskilde, Denmark
  2. Business Communication and Information Science, University of Southern Denmark, Denmark
  3. Informatics and Mathematical Modelling, Technical University of Denmark, Denmark
  4. Centre for Language Technology, Copenhagen, Denmark
  5. Computational Linguistics, Copenhagen Business School, Denmark