Dictionary-Based Amharic-French Information Retrieval

  • Atelach Alemu Argaw
  • Lars Asker
  • Rickard Cöster
  • Jussi Karlgren
  • Magnus Sahlgren
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4022)

Abstract

We present four approaches to the Amharic-French bilingual track at CLEF 2005. All experiments use a dictionary-based approach to translate the Amharic queries into French bags of words. One approach applies word sense discrimination to the translated side of the queries, while the other includes all senses of a translated word in the query used for searching. We used two search engines, the SICS experimental engine and Lucene, which gives four runs across the two approaches. Non-content-bearing words were removed both before and after the dictionary lookup: TF/IDF values supplemented by a heuristic function were used to remove stop words from the Amharic queries, and two French stop word lists were used to remove them from the French translations. In our experiments, we found that the SICS search engine performs better than Lucene and that the word-sense-discriminated keywords produce slightly better results than the full set of non-discriminated keywords.
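The all-senses variant of the pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the IDF-only stop word criterion (a stand-in for the TF/IDF-plus-heuristic filter, whose details are not given here), the threshold value, and the decision to drop out-of-dictionary terms are all assumptions made for illustration.

```python
import math
from collections import Counter


def idf_stopwords(documents, threshold):
    """Return terms whose inverse document frequency is below `threshold`.

    Assumption: plain IDF stands in for the TF/IDF-plus-heuristic filter
    mentioned in the abstract. `documents` is a list of token lists.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    return {t for t, df in doc_freq.items() if math.log(n_docs / df) < threshold}


def translate_query(query_terms, dictionary, source_stopwords, target_stopwords):
    """Translate a source-language query into a target-language bag of words.

    Every listed translation of a term is kept (no sense discrimination).
    Assumption: terms missing from the dictionary are simply dropped.
    """
    bag = []
    for term in query_terms:
        if term in source_stopwords:
            continue  # stop word removal before dictionary lookup
        for translation in dictionary.get(term, []):
            if translation not in target_stopwords:  # removal after lookup
                bag.append(translation)
    return bag
```

In this sketch, a sense-discriminated run would differ only in which translations are kept for each term; the bag of words passed to the search engine would then be a subset of the one produced here.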

Keywords

Search Engine, Query Term, Word Sense, Stop Word, Stop Word Removal
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Atelach Alemu Argaw (1)
  • Lars Asker (1)
  • Rickard Cöster (2)
  • Jussi Karlgren (2)
  • Magnus Sahlgren (2)

  1. Department of Computer and Systems Sciences, Stockholm University/KTH
  2. Swedish Institute of Computer Science (SICS)
