Evaluating Cross-Language Explicit Semantic Analysis and Cross Querying

  • Maik Anderka
  • Nedim Lipka
  • Benno Stein
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6241)

Abstract

This paper describes our participation in the TEL@CLEF task of the CLEF 2009 ad-hoc track. The task is to retrieve, from various multilingual collections of library catalog records, the items that are relevant to a user’s query. Two different strategies are employed: (i) Cross-Language Explicit Semantic Analysis (CL-ESA), where the library catalog records and the queries are represented in a multilingual concept space that is spanned by aligned Wikipedia articles, and (ii) a Cross Querying approach, where a query is translated into all target languages using Google Translate and the obtained rankings are combined. The evaluation shows that both strategies outperform the monolingual baseline and achieve comparable results.
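To make strategy (i) concrete, the following is a minimal sketch of the CL-ESA idea and not the authors' implementation. It assumes plain term-frequency weights, whitespace tokenization, and a toy pair of aligned "articles" per language; the names `esa_vector`, `cl_esa_similarity`, `INDEX_EN`, and `INDEX_DE` are hypothetical. Each text is represented by its cosine similarities to the index articles of its own language, and because the articles are aligned across languages, the resulting concept vectors can be compared directly.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term frequencies (assumed weighting, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def esa_vector(text, index_docs):
    """Concept vector: similarity of the text to each index article."""
    d = tf_vector(text)
    return [cosine(d, tf_vector(article)) for article in index_docs]


def cosine_dense(a, b):
    """Cosine similarity between two dense concept vectors (lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cl_esa_similarity(query, query_index, record, record_index):
    """Compare a query and a record through the shared concept space."""
    return cosine_dense(esa_vector(query, query_index),
                        esa_vector(record, record_index))


# Toy aligned index: article i describes the same concept in both languages.
INDEX_EN = ["music composer orchestra symphony", "river water lake fish"]
INDEX_DE = ["musik komponist orchester sinfonie", "fluss wasser see fisch"]

score_music = cl_esa_similarity("symphony orchestra", INDEX_EN,
                                "sinfonie für orchester", INDEX_DE)
score_river = cl_esa_similarity("symphony orchestra", INDEX_EN,
                                "fisch im fluss", INDEX_DE)
```

In this toy setting the German record about an orchestra scores higher for the English query than the record about a river, although query and records share no terms.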

Furthermore, inspired by the Generalized Vector Space Model we present a formal definition and an alternative interpretation of the CL-ESA model. This interpretation is interesting for real-world retrieval applications since it reveals how the computational effort for CL-ESA can be shifted from the query phase to a preprocessing phase.
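One possible reading of this shift, sketched under assumptions that the abstract does not confirm: if the similarity in the concept space is taken as an unnormalized dot product, the two projections onto the aligned index articles can be regrouped into a single term-by-term matrix that depends only on the index collection and can therefore be built before any query arrives. The function names and the toy weights below are hypothetical.

```python
from collections import defaultdict

# Toy aligned index articles as term -> weight maps (assumed weights).
INDEX_EN = [{"symphony": 1.0, "orchestra": 1.0}, {"river": 1.0, "fish": 1.0}]
INDEX_DE = [{"sinfonie": 1.0, "orchester": 1.0}, {"fluss": 1.0, "fisch": 1.0}]


def concept_vector(doc, index):
    """Project a sparse term vector onto the index articles (dot products)."""
    return [sum(w * article.get(t, 0.0) for t, w in doc.items())
            for article in index]


def query_time_score(q, index_q, d, index_d):
    """Direct variant: both projections are computed when the query arrives."""
    cq = concept_vector(q, index_q)
    cd = concept_vector(d, index_d)
    return sum(a * b for a, b in zip(cq, cd))


def build_term_matrix(index_q, index_d):
    """Preprocessing: G[(tq, td)] sums, over aligned article pairs,
    the product of the two term weights."""
    G = defaultdict(float)
    for art_q, art_d in zip(index_q, index_d):
        for tq, wq in art_q.items():
            for td, wd in art_d.items():
                G[(tq, td)] += wq * wd
    return G


def precomputed_score(q, d, G):
    """Query phase: only the nonzero terms of q and d are touched."""
    return sum(wq * wd * G.get((tq, td), 0.0)
               for tq, wq in q.items()
               for td, wd in d.items())


G = build_term_matrix(INDEX_EN, INDEX_DE)
query = {"symphony": 2.0, "river": 1.0}
record = {"orchester": 1.0, "fisch": 3.0}

direct = query_time_score(query, INDEX_EN, record, INDEX_DE)
shifted = precomputed_score(query, record, G)
```

Both variants yield the same score because the regrouping only reorders the sums. With a cosine in the concept space, the normalization would have to be handled separately.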

Keywords

Cosine Similarity, Preprocessing Phase, Main Language, Index Document, Term Weighting Schema
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Anderka, M., Stein, B.: The ESA Retrieval Model Revisited. In: Proc. of SIGIR 2009, pp. 670–671 (2009)
  2. Gabrilovich, E., Markovitch, S.: Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. In: Proc. of IJCAI 2007, pp. 1606–1611 (2007)
  3. Potthast, M., Stein, B., Anderka, M.: A Wikipedia-Based Multilingual Retrieval Model. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds.) ECIR 2008. LNCS, vol. 4956, pp. 522–530. Springer, Heidelberg (2008)
  4. Sorg, P., Cimiano, P.: Cross-lingual Information Retrieval with Explicit Semantic Analysis. In: Peters, C., Deselaers, T., Ferro, N., Gonzalo, J., Jones, G.J.F., Kurimo, M., Mandl, T., Peñas, A., Petras, V. (eds.) CLEF 2008. LNCS, vol. 5706, pp. 243–250. Springer, Heidelberg (2009)
  5. Wong, S.K.M., Ziarko, W., Wong, P.C.N.: Generalized Vector Spaces Model in Information Retrieval. In: Proc. of SIGIR 1985, pp. 18–25 (1985)
  6. Yang, Y., Carbonell, J.G., Brown, R.D., Frederking, R.E.: Translingual Information Retrieval: Learning from Bilingual Corpora. Artif. Intell. 103(1-2), 323–345 (1998)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Maik Anderka (1)
  • Nedim Lipka (1)
  • Benno Stein (1)
  1. Bauhaus-Universität Weimar, Weimar, Germany
