Use of Dependency Microcontexts in Information Retrieval

  • Martin Holub
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1963)


This paper focuses on two problems that are crucial for retrieval performance in information retrieval (IR) systems: the loss of information caused by document pre-processing, and the difficulty caused by homonymous and synonymous words in natural language. The author argues that the limitations of traditional IR methods, i.e. methods that deal with individual terms without considering the relations among them, can be overcome using natural language processing (NLP). To detect the relations among terms in sentences and to make use of lemmatisation and of morphological and syntactic tagging of Czech texts, the author proposes a method for constructing dependency word microcontexts, extracted from texts fully automatically, and several ways to exploit the microcontexts to increase retrieval performance.


Information Retrieval, Natural Language Processing, Retrieval Performance, Ambiguous Word, Word Sense
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Martin Holub (1)
  1. Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
