Abstract
We describe the Eurospider component for Cross-Language Information Retrieval (CLIR), which has been employed for experiments at all three CLEF campaigns to date. Central to our efforts is the use of combination approaches, which integrate multiple language pairs, translation resources and translation methods into one multilingual retrieval system. We discuss the implications of building a system that allows flexible combination, give details of the various translation resources and methods, and investigate the impact of merging the intermediate results generated by the individual steps. We also analyze the resulting combination system, taking into account the additional requirements that arise when deploying it as a component in an operational, commercial setting.
Cite this article
Braschler, M. Combination Approaches for Multilingual Text Retrieval. Information Retrieval 7, 183–204 (2004). https://doi.org/10.1023/B:INRT.0000009445.19495.46
DOI: https://doi.org/10.1023/B:INRT.0000009445.19495.46