Local Query Expansion Using Terms Windows for Robust Retrieval

  • Angel F. Zazo
  • Jose L. Alonso Berrocal
  • Carlos G. Figuerola
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4730)

Abstract

This paper describes our work in the CLEF 2006 Robust task, an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We participated in all subtasks: monolingual (English, French, Italian and Spanish), bilingual (Italian to Spanish) and multilingual (Spanish to [English, French, Italian and Spanish]). In monolingual retrieval we focused our effort on local query expansion, i.e. using only the information from the retrieved documents, not from the complete document collection or from external corpora such as the Web. Several local expansion techniques were applied to the training topics. With regard to robustness, the most effective one was the use of co-occurrence based thesauri, constructed from co-occurrence relations within windows of terms rather than within complete documents. This technique is effective and easy to implement, since only a few parameters need tuning. In the bilingual and multilingual retrieval experiments, several machine translation programs were used to translate the topics. For each target language, the translations were merged before performing a monolingual retrieval, and the same local expansion technique was applied. In multilingual retrieval, weighted max-min normalization was used to merge the result lists. In all the subtasks in which we participated, our mandatory runs (using the title and description fields of the topics) obtained very good rankings. Runs with short queries (title field only) also obtained high MAP and GMAP values using the same expansion technique.
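
As an illustration of the window-based co-occurrence idea, the following minimal Python sketch counts how often two terms appear within a fixed number of positions of each other in the top retrieved documents, and then ranks candidate expansion terms by their association with the query terms. The function names, the `window` and `n_expansion` parameters, and the Dice-style association score are assumptions made for this example; the abstract does not specify the association measure or the parameter values used in the experiments.

```python
from collections import Counter, defaultdict


def window_cooccurrence(docs, window=5):
    """Count term frequencies and pairwise co-occurrences inside term windows.

    `docs` is a list of token lists (e.g. the top retrieved documents).
    Two terms co-occur when they are at most `window` positions apart.
    """
    term_freq = Counter()
    pair_freq = defaultdict(Counter)
    for tokens in docs:
        term_freq.update(tokens)
        for i, t in enumerate(tokens):
            # Look only at the next `window` tokens; the update is symmetric.
            for u in tokens[i + 1 : i + 1 + window]:
                if u != t:
                    pair_freq[t][u] += 1
                    pair_freq[u][t] += 1
    return term_freq, pair_freq


def expand_query(query_terms, docs, window=5, n_expansion=3):
    """Return the `n_expansion` terms most associated with the query terms."""
    term_freq, pair_freq = window_cooccurrence(docs, window)
    scores = Counter()
    for q in query_terms:
        for u, c in pair_freq.get(q, {}).items():
            if u in query_terms:
                continue
            # Dice-style association (an assumption, not the paper's measure).
            scores[u] += 2.0 * c / (term_freq[q] + term_freq[u])
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:n_expansion]]


if __name__ == "__main__":
    retrieved = [
        ["robust", "retrieval", "uses", "query", "expansion"],
        ["query", "expansion", "improves", "robust", "retrieval"],
    ]
    print(expand_query(["query"], retrieved, window=1, n_expansion=2))
```

Restricting co-occurrence to a window means that two terms count as related only when they appear close together, so terms that merely share a long document do not contribute to the thesaurus.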
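
The merging step for multilingual retrieval can be sketched in a similar way. The example below rescales the scores of each per-language result list to the range [0, 1] with max-min normalization, multiplies them by a per-list weight, and sorts all documents into a single ranking. How the weights were chosen is not stated in the abstract, so the `weights` argument, the function names and the sample scores are purely hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to [0, 1] using max-min normalization."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; map every document to 1.0 to avoid dividing by zero.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def merge_lists(result_lists, weights):
    """Merge per-language result lists into one ranking.

    `result_lists` maps a language code to a {doc_id: score} mapping.
    `weights` maps a language code to a multiplier (hypothetical values;
    lists without an entry get a weight of 1.0).
    """
    merged = {}
    for lang, scores in result_lists.items():
        if not scores:
            continue
        w = weights.get(lang, 1.0)
        for doc_id, s in min_max_normalize(scores).items():
            # Keep the best weighted score if a document id appears twice.
            merged[doc_id] = max(merged.get(doc_id, 0.0), w * s)
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    lists = {
        "es": {"es1": 10.0, "es2": 5.0, "es3": 0.0},
        "en": {"en1": 3.0, "en2": 1.0},
    }
    for doc_id, score in merge_lists(lists, {"es": 1.0, "en": 0.5}):
        print(doc_id, score)
```

Normalizing before merging makes scores from different collections comparable, and the weight lets a list that is judged more reliable contribute more to the top of the final ranking.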

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Angel F. Zazo (1)
  • Jose L. Alonso Berrocal (1)
  • Carlos G. Figuerola (1)
  1. REINA Research Group – University of Salamanca, C/ Francisco Vitoria 6-16, 37008 Salamanca, Spain