An Empirical Comparison of Translation Disambiguation Techniques for Chinese–English Cross-Language Information Retrieval
Disambiguation techniques are typically employed to reduce translation errors introduced during query translation in cross-lingual information retrieval. Previous work has used several techniques based on term similarity, term co-occurrence, and language modelling. However, the previous experiments were conducted on different data sets, so the relative merits of each technique are presently unclear. The goal of this work is to compare the effectiveness of these techniques on the same Chinese–English data sets. Our results show that, despite the different underlying models and formulae used, the aggregated results are comparable. However, there is wide variation in the translation of individual queries, suggesting that there is scope for further improvement.
Keywords: Language Modelling; Query Term; Query Translation; Individual Query; Candidate Translation