CLEF 2003: Comparative Evaluation of Multilingual Information Access Systems, pp. 21–28
Analysis of the Reliability of the Multilingual Topic Set for the Cross Language Evaluation Forum
Abstract
The reliability of the topic set used in the Cross Language Evaluation Forum (CLEF) must be validated continually, both to justify the effort invested in CLEF experiments and to demonstrate, as far as possible, that the results are trustworthy. The analysis presented in this paper addresses several aspects of this question. Continuing and expanding a study from 2002, we investigate the difficulty of individual topics and the correlation between per-topic retrieval quality and the occurrence of proper names in the topics.
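The correlation analysis described above can be sketched as follows. This is a minimal illustration, not the authors' actual method or data: the per-topic average precision values and proper-name counts below are invented for demonstration, and the paper does not specify which correlation coefficient was used (Pearson's r is assumed here).

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical per-topic data (one entry per CLEF topic):
# mean average precision across systems, and number of proper names in the topic.
avg_precision = [0.12, 0.45, 0.30, 0.55, 0.20, 0.60]
proper_names = [0, 2, 1, 3, 0, 3]

r = pearson(avg_precision, proper_names)
print(f"correlation between AP and proper-name count: {r:.3f}")
```

A strongly positive r on real CLEF data would support the hypothesis that topics containing proper names are easier for retrieval systems.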
Keywords
Information Retrieval · Average Precision · Retrieval Performance · Information Retrieval System · Linguistic Phenomenon
© Springer-Verlag Berlin Heidelberg 2004