Abstract
We present two methods for estimating replacement probabilities without using parallel corpora. The first exploits the translation probabilities latent in machine-readable dictionaries (MRDs). The second, which is more robust, uses context similarity-based techniques to estimate word translation probabilities, treating the Internet as a bilingual comparable corpus. Experiments show that weighting structured queries with the replacement probabilities obtained by the proposed methods gives a statistically significant improvement in MAP over non-weighted structured queries. The context similarity-based method yields the most significant improvement.
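The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of the general idea behind context similarity-based weighting, under stated assumptions: context vectors are plain bags of words, similarity is cosine, probabilities are obtained by normalising the similarities, and the weighted structured query uses a probability-weighted sum of term frequencies. The function names and the toy data ("bank", "bench") are hypothetical illustrations and are not taken from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def translation_probabilities(source_context, candidate_contexts):
    """Normalise context similarities into a distribution over candidates.

    Assumption: source_context has already been mapped into the target
    language (for example through a bilingual dictionary), so it can be
    compared directly with each candidate's context vector.
    """
    scores = {
        cand: cosine(source_context, ctx)
        for cand, ctx in candidate_contexts.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # No contextual evidence at all: fall back to a uniform distribution.
        return {cand: 1.0 / len(scores) for cand in scores}
    return {cand: score / total for cand, score in scores.items()}


def weighted_tf(doc_tf, probs):
    """Term frequency of a source query term in a target-language document,
    computed as the probability-weighted sum over its translations."""
    return sum(p * doc_tf.get(cand, 0) for cand, p in probs.items())


# Hypothetical toy data: one ambiguous source word with two dictionary
# translations, and context words collected from a comparable corpus.
source_context = Counter({"money": 3, "account": 2, "park": 1})
candidate_contexts = {
    "bank": Counter({"money": 4, "account": 3, "loan": 2}),
    "bench": Counter({"park": 3, "sit": 2, "wood": 1}),
}

probs = translation_probabilities(source_context, candidate_contexts)
score = weighted_tf({"bank": 2, "bench": 1}, probs)
```

In this toy case the source context shares more vocabulary with "bank" than with "bench", so "bank" receives the larger probability and contributes more to the weighted term frequency. A non-weighted structured query would instead treat both translations as equally likely.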
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saralegi, X., Lopez de Lacalle, M. (2010). Estimating Translation Probabilities from the Web for Structured Queries on CLIR. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_53
DOI: https://doi.org/10.1007/978-3-642-12275-0_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science, Computer Science (R0)