Amharic-English Information Retrieval with Pseudo Relevance Feedback
We describe cross-language retrieval experiments using Amharic queries and an English-language document collection. Two monolingual and eight bilingual runs were submitted, varying in the use of long or short queries, the presence of pseudo relevance feedback (PRF), and the approach to word sense disambiguation (WSD). We used an Amharic-English machine-readable dictionary (MRD) and an online Amharic-English dictionary for lookup translation of query terms. Out-of-dictionary Amharic query terms were treated as possible named entities and filtered further through restricted fuzzy matching, based on edit distance calculated against automatically extracted English proper names. The results indicate that longer queries tend to perform similarly to short ones, that PRF improves performance considerably, and that queries tend to fare better with WSD than with maximal expansion, in which all the translations given in the MRD are taken.
Keywords: Edit Distance, Query Term, Stop Word, Word Sense Disambiguation, Stop Word Removal
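The edit-distance filtering of out-of-dictionary terms mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Amharic term has already been transliterated into Latin script, and the function names, the length-difference restriction, and the threshold `max_distance=2` are hypothetical choices made only for illustration.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_named_entity(term: str,
                       proper_names: Iterable[str],
                       max_distance: int = 2) -> Optional[str]:
    """Return the closest proper name within max_distance, or None.

    The threshold and the length check are assumptions, standing in for
    whatever restriction the original system applied.
    """
    best_name, best_dist = None, max_distance + 1
    for name in proper_names:
        # Cheap restriction: names whose length differs too much cannot match.
        if abs(len(name) - len(term)) > max_distance:
            continue
        dist = edit_distance(term.lower(), name.lower())
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


# Hypothetical list of proper names extracted from the English collection.
names = ["Mandela", "Madrid", "Clinton"]
print(match_named_entity("klinten", names))
print(match_named_entity("xyzxyz", names))
```

A term that matches no proper name within the threshold returns `None`, so it can be dropped from the query or handled separately rather than being forced onto an unrelated name.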