Scalable Multilingual Information Access

  • Paul McNamee
  • James Mayfield
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2785)

Abstract

The third Cross-Language Evaluation Forum workshop (CLEF-2002) provided an unprecedented opportunity to evaluate retrieval in eight different languages using a common set of topics and a uniform assessment methodology. This year the Johns Hopkins University Applied Physics Laboratory participated in the monolingual, bilingual, and multilingual retrieval tasks. We contend that information access across a plethora of languages requires approaches that are inexpensive in both development and run-time costs. In this paper we describe a simplified approach that seems suitable for retrieval in many languages; we also show how good retrieval is possible over many languages, even when translation resources are scarce or when query-time translation is infeasible. In particular, we investigate the use of character n-grams for monolingual retrieval, CLIR between related languages using partial morphological matches, and translation of document representations to an interlingua for computationally efficient retrieval against multiple languages.



Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Paul McNamee
  • James Mayfield
  Applied Physics Laboratory, Johns Hopkins University, Laurel, USA
