Amharic-English Information Retrieval with Pseudo Relevance Feedback

  • Atelach Alemu Argaw
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5152)

Abstract

We describe cross-language retrieval experiments using Amharic queries and an English-language document collection. Two monolingual and eight bilingual runs were submitted, varying in the use of long and short queries, the presence of pseudo relevance feedback (PRF), and the approach to word sense disambiguation (WSD). We used an Amharic-English machine-readable dictionary (MRD) and an online Amharic-English dictionary for lookup translation of query terms. Out-of-dictionary Amharic query terms were treated as possible named entities, and further filtering was done through restricted fuzzy matching based on edit distance, calculated against automatically extracted English proper names. The results indicate that longer queries tend to perform similarly to short ones, that PRF improves performance considerably, and that queries tend to fare better with WSD than with maximal expansion of terms using all the translations given in the MRD.
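The named-entity filtering step described above can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the distance threshold of 2, the case-insensitive comparison, and the assumption that the Amharic term has already been transliterated into Latin script are all hypothetical choices made for illustration. It computes a standard Levenshtein edit distance and keeps the closest English proper name only if it falls within the threshold.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_named_entity(term, proper_names, max_distance=2):
    """Return the closest proper name within max_distance, or None.

    max_distance is an assumed threshold, not a value taken from the paper.
    """
    best, best_d = None, max_distance + 1
    for name in proper_names:
        d = edit_distance(term.lower(), name.lower())
        if d < best_d:  # strictly closer than anything seen so far
            best, best_d = name, d
    return best
```

For example, with a hypothetical list of extracted names, `match_named_entity("klinton", ["Clinton", "London"])` selects `"Clinton"`, while a term with no name inside the threshold yields `None` and would not be treated as a named entity.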

Keywords

Edit Distance, Query Term, Stop Word, Word Sense Disambiguation, Stop Word Removal
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Atelach Alemu Argaw (1)
  1. Department of Computer and System Sciences, Stockholm University/KTH
