SINAI at CLEF Ad-Hoc Robust Track 2007: Applying Google Search Engine for Robust Cross-Lingual Retrieval

  • F. Martínez-Santiago
  • A. Montejo-Ráez
  • M. A. García-Cumbreras
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5152)


We report our web-based query generation experiments on the English and French collections in the Robust task of the CLEF Ad-Hoc track. We continue the approach adopted the previous year, with a modified model. Last year we used Google to expand the original query; this year we create a new expanded query in addition to the original one. We therefore retrieve two lists of relevant documents, one per query (the original and the expanded one), and integrate them with a merging solution based on logistic regression. The results obtained are discouraging, but the failure analysis shows that very difficult queries improve when both queries are used instead of the original query alone. The open problem is deciding when a query is very difficult.
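The merging step can be illustrated with a minimal sketch. The abstract does not give the model's features or coefficients, so everything here is an assumption: the sketch uses one common formulation of logistic regression merging, in which each list gets its own model that maps a document's rank and retrieval score to an estimated probability of relevance, and the lists are then merged by that probability. The names `fit_logistic`, `features`, `relevance_probability`, and `merge_lists`, and the choice of `ln(rank)` and raw score as features, are hypothetical and not taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(rank, score):
    """Assumed feature vector: log of the 1-based rank and the raw score."""
    return (math.log(rank), score)


def relevance_probability(w, x):
    """w = [bias, w_1, ..., w_n]; x = feature tuple of length n."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def fit_logistic(samples, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent.

    samples: feature tuples from training queries with known judgments.
    labels: 1 for relevant, 0 for non-relevant.
    """
    w = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(samples, labels):
            err = relevance_probability(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad)]
    return w


def merge_lists(ranked_lists, weights):
    """Merge per-query result lists by estimated relevance probability.

    ranked_lists: {list_name: [(doc_id, score), ...]} in rank order.
    weights: {list_name: fitted weight vector for that list}.
    A document found in several lists keeps its highest probability.
    """
    best = {}
    for name, ranked in ranked_lists.items():
        for rank, (doc_id, score) in enumerate(ranked, start=1):
            p = relevance_probability(weights[name], features(rank, score))
            best[doc_id] = max(p, best.get(doc_id, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: two short lists and hand-picked weights.
example_lists = {
    "original": [("d1", 0.9), ("d2", 0.5)],
    "expanded": [("d3", 0.8), ("d1", 0.4)],
}
example_weights = {
    "original": [0.0, -1.0, 1.0],
    "expanded": [-1.0, -1.0, 1.0],
}
merged = merge_lists(example_lists, example_weights)
```

Fitting a separate model per list is one way to make scores from the original and the expanded query comparable before merging; keeping the maximum probability for documents that appear in both lists is a design choice of this sketch, and summing or averaging would be equally plausible alternatives.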


Keywords: Noun Phrase, Expanded Query, Pension Scheme, Prepositional Phrase, Original Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • F. Martínez-Santiago (1)
  • A. Montejo-Ráez (1)
  • M. A. García-Cumbreras (1)

  1. SINAI Research Group, Computer Science Department, University of Jaén, Spain
