Exploiting Multiple Translation Resources for English-Persian Cross Language Information Retrieval
A key factor in the performance of a Cross Language Information Retrieval (CLIR) system is how it exploits the available translation resources. This issue is more challenging for a language that lacks adequate translation resources. Another factor that affects performance is the degree of ambiguity of the query words. In this paper, we propose combining different translation resources for CLIR. We also propose two methods that exploit phrases in the query translation process to resolve the ambiguity of query words. Our evaluation results on English-Persian CLIR show that the phrase-based and combined-translation methods outperform other CLIR methods.
Keywords: Cross Language Information Retrieval; English-Persian CLIR; Phrase Based Query Translation; Combining Translation Resources for CLIR