
Combination Methods for Crosslingual Web Retrieval

  • Jaap Kamps
  • Maarten de Rijke
  • Börkur Sigurbjörnsson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4022)

Abstract

We investigate a range of crosslingual web retrieval tasks using the test suite of the CLEF 2005 WebCLEF track, which features a stream of known-item topics in various languages. Our main findings are: (i) straightforward indexing and retrieval is effective for mixed monolingual web retrieval; (ii) standard machine translation methods are effective for bilingual web retrieval; but (iii) standard combination methods are ineffective for multilingual web retrieval; we analyze the failure and suggest an alternative Z-score normalization that leads to effective multilingual retrieval results.
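The abstract does not spell out the normalization formula, so the following is only a minimal sketch of how Z-score normalization is commonly applied when merging ranked lists from separate per-language runs. It assumes the textbook definition (subtract the run's mean score, divide by its standard deviation); the function names, the example document ids, and the choice to keep the highest normalized score for a duplicated document are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def z_score_normalize(run):
    """Rescale one run's scores to (score - mean) / standard deviation.

    `run` maps document ids to raw retrieval scores (hypothetical format).
    """
    scores = list(run.values())
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores are identical, so no document stands out in this run.
        return {doc: 0.0 for doc in run}
    return {doc: (score - mu) / sigma for doc, score in run.items()}


def merge_runs(runs):
    """Normalize each run on its own, then rank all documents together.

    Assumption: if a document occurs in several runs, keep its best z-score.
    """
    merged = {}
    for run in runs:
        for doc, z in z_score_normalize(run).items():
            merged[doc] = max(z, merged.get(doc, float("-inf")))
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Two hypothetical runs whose raw scores live on very different scales.
english_run = {"en1": 12.0, "en2": 8.0, "en3": 4.0}
dutch_run = {"nl1": 0.9, "nl2": 0.5, "nl3": 0.1}

ranking = merge_runs([english_run, dutch_run])
for doc, z in ranking:
    print(f"{doc}\t{z:.3f}")
```

Merging the raw scores directly would place every English document above every Dutch one purely because of scale. After normalization the top document of each run competes on equal footing, which is the kind of effect a Z-score approach is meant to achieve.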

Keywords

Machine Translation; Combination Method; Round Robin; Relevant Page; Reciprocal Rank
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jaap Kamps (1, 2)
  • Maarten de Rijke (2)
  • Börkur Sigurbjörnsson (2)
  1. Archives and Information Science, Faculty of Humanities, University of Amsterdam
  2. ISLA, Faculty of Science, University of Amsterdam
