An Empirical Comparison of Translation Disambiguation Techniques for Chinese–English Cross-Language Information Retrieval

  • Ying Zhang
  • Phil Vines
  • Justin Zobel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)

Abstract

Disambiguation techniques are typically employed to reduce translation errors introduced during query translation in cross-lingual information retrieval. Previous work has used several techniques, based on term similarity, term co-occurrence, and language modelling. However, the previous experiments were conducted on different data sets, so the relative merits of each technique are presently unclear. The goal of this work is to compare the effectiveness of these techniques on the same Chinese–English data sets. Our results show that, despite the different underlying models and formulae used, the aggregated results are comparable. However, there is wide variation in the translation of individual queries, suggesting that there is scope for further improvement.

Keywords

Language Modelling; Query Term; Query Translation; Individual Query; Candidate Translation

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ying Zhang (1)
  • Phil Vines (1)
  • Justin Zobel (1)
  1. School of Computer Science and Information Technology, RMIT University, Melbourne, Australia
