Disambiguation Strategies for Cross-Language Information Retrieval

  • Djoerd Hiemstra
  • Franciska de Jong
Conference paper

DOI: 10.1007/3-540-48155-9_18

Volume 1696 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Hiemstra D., de Jong F. (1999) Disambiguation Strategies for Cross-Language Information Retrieval. In: Abiteboul S., Vercoustre AM. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1999. Lecture Notes in Computer Science, vol 1696. Springer, Berlin, Heidelberg

Abstract

This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) developed within the Twenty-One project. The tools and methods are evaluated on the TREC CLIR task document collection, using Dutch queries on the English document base. The main issue addressed here is an evaluation of two approaches to disambiguation. The underlying question is whether considerable effort should be put into finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results. The experimental study suggests that the quality of search methods is more important than the quality of disambiguation methods. Good retrieval methods are able to disambiguate translated queries implicitly during searching.

Keywords

  • Cross-Language Information Retrieval
  • Statistical Machine Translation

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Djoerd Hiemstra (1)
  • Franciska de Jong (1)
  1. Centre for Telematics and Information Technology, University of Twente, Enschede, The Netherlands