Applying Light Natural Language Processing to Ad-Hoc Cross Language Information Retrieval

  • Christina Lioma
  • Craig Macdonald
  • Ben He
  • Vassilis Plachouras
  • Iadh Ounis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4022)


In the CLEF 2005 Ad-Hoc Track we addressed the problem of retrieving information in morphologically rich languages by experimenting with language-specific morphosyntactic processing and light Natural Language Processing (NLP). The diversity of the languages processed, namely Bulgarian, French, Italian, English, and Greek, allowed us to measure the effect of system-specific features upon the retrieval of these languages, and to juxtapose that effect with the role of language resources in Cross Language Information Retrieval (CLIR) in general.


Keywords: Noun Phrase, Natural Language Processing, Machine Translation, Retrieval Performance, Query Expansion





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Christina Lioma (1)
  • Craig Macdonald (1)
  • Ben He (1)
  • Vassilis Plachouras (1)
  • Iadh Ounis (1)
  1. University of Glasgow, UK
