Cross-Language Retrieval for the CLEF Collections — Comparing Multiple Methods of Retrieval

  • Fredric C. Gey
  • Hailing Jiang
  • Vivien Petras
  • Aitao Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2069)


For CLEF, the Berkeley group participated in the monolingual, multilingual and GIRT tasks. To help enrich the CLEF relevance set for future training, we prepared a manual reformulation of the original German queries, which achieved excellent performance: more than 110% better than the average of median precision. For the GIRT task, we performed English-German cross-language IR, comparing commercial machine translation with thesaurus lookup and query expansion techniques. Combining all techniques using simple data fusion produced the best results.
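The abstract does not say which fusion rule was used. As an illustration only, the sketch below shows one common form of simple data fusion: each run's scores are min-max normalized and then summed per document (a CombSUM-style rule). The function names, the run names and all scores are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one run's scores to [0, 1]; a constant run maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_runs(runs):
    """Sum normalized scores across runs and return documents ranked best first."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    # Sort by fused score (descending); break ties by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from three query-translation strategies (assumed data).
mt_run = {"d1": 0.9, "d2": 0.4, "d3": 0.1}          # machine translation
thesaurus_run = {"d2": 12.0, "d3": 7.0, "d4": 2.0}  # thesaurus lookup
expansion_run = {"d1": 3.0, "d2": 2.5, "d4": 0.5}   # query expansion

ranking = fuse_runs([mt_run, thesaurus_run, expansion_run])
for doc, score in ranking:
    print(doc, round(score, 3))
```

Normalizing before summing keeps a run with a large raw score range from dominating the fused ranking. A document missing from a run simply contributes nothing for that run.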







Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Fredric C. Gey (1)
  • Hailing Jiang (2)
  • Vivien Petras (2)
  • Aitao Chen (2)
  1. UC Data Archive & Technical Assistance, USA
  2. School of Information Management and Systems, University of California, CA 94720, USA
