UFRGS@CLEF2008: Using Association Rules for Cross-Language Information Retrieval

  • André Pinto Geraldo
  • Viviane P. Moreira
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5706)

Abstract

For UFRGS’s participation in the TEL task at CLEF2008, our aim was to assess the validity of using association rule mining algorithms to find mappings between concepts in a Cross-Language Information Retrieval scenario. Our approach requires a sample of parallel documents to serve as the basis for generating the association rules. The experimental results show that the performance of our approach is not statistically different from the monolingual baseline in terms of mean average precision (MAP). This indicates that association rules can be used effectively to map concepts between languages. We also tested a modification to BM25 that aims to increase the weight of rare terms. The results show that the modified version performed better, and the improvements were statistically significant in terms of MAP on our monolingual runs.
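As an illustration of the general idea, the following is a minimal sketch of mining term-to-term association rules from parallel documents and using them to map query terms. It is not the authors' implementation: the function names `mine_term_rules` and `translate_query`, the whitespace tokenisation, and the support and confidence thresholds are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def mine_term_rules(parallel_docs, min_support=0.5, min_confidence=0.6):
    """Mine rules source_term -> target_term from aligned document pairs.

    parallel_docs is a list of (source_text, target_text) tuples.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(parallel_docs)
    src_count = Counter()   # number of pairs containing each source term
    pair_count = Counter()  # number of pairs containing (source, target) together

    for src_text, tgt_text in parallel_docs:
        src_terms = set(src_text.lower().split())
        tgt_terms = set(tgt_text.lower().split())
        src_count.update(src_terms)
        pair_count.update(product(src_terms, tgt_terms))

    rules = {}
    for (s, t), c in pair_count.items():
        support = c / n                 # fraction of pairs where s and t co-occur
        confidence = c / src_count[s]   # estimate of P(t in target | s in source)
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(s, []).append((t, support, confidence))

    # Order candidates by confidence, then support, then alphabetically.
    for s in rules:
        rules[s].sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules


def translate_query(query, rules):
    """Replace each query term by its best-ranked rule target, if any."""
    translated = []
    for term in query.lower().split():
        if term in rules:
            translated.append(rules[term][0][0])
        else:
            translated.append(term)  # keep terms with no rule unchanged
    return translated
```

With a toy parallel sample such as `[("red house", "casa roja"), ("red car", "coche rojo"), ("old house", "casa vieja"), ("old car", "coche viejo")]`, only the pairs that co-occur consistently (for example `house` and `casa`) pass both thresholds, while terms such as `red` yield no rule and are left untranslated.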

Keywords

association rules experimentation performance measurement 

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • André Pinto Geraldo (1)
  • Viviane P. Moreira (1)
  1. Instituto de Informática, UFRGS, Porto Alegre, Brazil
