Experiments with Monolingual, Bilingual, and Robust Retrieval

  • Jacques Savoy
  • Samir Abdou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4730)


For our participation in the CLEF 2006 campaign, our first objective was to propose and evaluate a decompounding algorithm and a more aggressive stemmer for the Hungarian language. Our second objective was to obtain a better picture of the relative merit of various search engines for the French, Portuguese/Brazilian and Bulgarian languages. To achieve this, we evaluated the test collections using the Okapi approach, several models derived from the Divergence from Randomness (DFR) family, and a language model (LM), as well as two vector-processing approaches. In the bilingual track, we evaluated the effectiveness of various machine translation (MT) systems for queries submitted in English and automatically translated into French and Portuguese. After blind query expansion, the mean average precision (MAP) achieved by the best single MT system was around 95% of that of the corresponding monolingual search when French was the target language, and around 83% with Portuguese. Finally, in the robust retrieval task, we investigated various techniques for improving retrieval performance on difficult topics.
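The bilingual figures above express the MAP of a translated-query run as a percentage of the MAP of the monolingual run on the same topics. As a minimal sketch of that arithmetic, assuming the standard (non-interpolated) definition of average precision, the Python below computes MAP for two runs and their ratio. The function names, topic identifiers, document identifiers, and rankings are all hypothetical toy values chosen for illustration; they are not taken from the paper's runs or from the CLEF collections.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where a relevant document is found.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    run: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-topic average precision over all topics in qrels."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(run.get(topic, []), relevant)
        for topic, relevant in qrels.items()
    )
    return total / len(qrels)


# Hypothetical toy data: two topics, relevance judgments, and two runs.
qrels = {"T1": {"d1", "d3"}, "T2": {"d2"}}
monolingual_run = {"T1": ["d1", "d3", "d5"], "T2": ["d2", "d4"]}
bilingual_run = {"T1": ["d1", "d5", "d3"], "T2": ["d4", "d2"]}

map_mono = mean_average_precision(monolingual_run, qrels)
map_bili = mean_average_precision(bilingual_run, qrels)

# Bilingual effectiveness as a percentage of the monolingual baseline.
relative = 100.0 * map_bili / map_mono
print(f"monolingual MAP = {map_mono:.4f}")
print(f"bilingual MAP   = {map_bili:.4f}")
print(f"bilingual / monolingual = {relative:.1f}%")
```

In an actual evaluation the rankings and relevance judgments would come from the test collection and the official assessments, and MAP would normally be computed with the campaign's own evaluation tooling rather than a hand-written function.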





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jacques Savoy (1)
  • Samir Abdou (1)
  1. Computer Science Department, University of Neuchatel, Rue Emile Argand 11, 2009 Neuchatel, Switzerland
