Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3491)

Abstract

This experiment tests a simple, scalable, and effective approach to building a domain-specific translation lexicon using distributional statistics over parallelized bilingual corpora. A bilingual lexicon is extracted from aligned Swedish-French data and used to translate CLEF topics from Swedish to French; the resulting French queries are then used to retrieve documents from the French-language CLEF collection. For the recall-oriented “precision at 1000 documents” score, 34 of 50 queries are at or above the median, and many of the errors could be handled by string matching and cognate search. We conclude that the approach presented here is a simple and efficient component in an automatic query translation system.
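As a rough illustration of the general idea (not the authors' implementation, which the abstract does not specify), the sketch below represents each word by the set of aligned segments it occurs in and pairs every source word with the target word whose occurrence pattern is most similar. The function names, the cosine measure over binary occurrence sets, and the toy Swedish-French segments are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt


def segment_index(segments):
    """Map each word to the set of aligned-segment ids it occurs in."""
    index = defaultdict(set)
    for seg_id, text in enumerate(segments):
        for word in text.lower().split():
            index[word].add(seg_id)
    return index


def cosine(a, b):
    """Cosine similarity between two binary occurrence sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def build_lexicon(source_segments, target_segments):
    """Pair each source word with its distributionally closest target word."""
    src = segment_index(source_segments)
    tgt = segment_index(target_segments)
    return {
        s_word: max(tgt, key=lambda t: cosine(s_ids, tgt[t]))
        for s_word, s_ids in src.items()
    }


def translate_query(query, lexicon):
    """Translate word by word; unknown words pass through unchanged."""
    return [lexicon.get(word, word) for word in query.lower().split()]


# Hypothetical toy data: three aligned segments, function words removed.
swedish = [
    "parlamentet antog förslaget",
    "kommissionen presenterade förslaget",
    "parlamentet kritiserade kommissionen",
]
french = [
    "parlement adopte proposition",
    "commission présente proposition",
    "parlement critique commission",
]

lexicon = build_lexicon(swedish, french)
french_query = translate_query("parlamentet förslaget", lexicon)
```

On this toy data the query "parlamentet förslaget" maps to the French terms "parlement" and "proposition". Passing unknown words through unchanged is one simple fallback; it lets cognates and proper names still match in the target collection.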

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karlgren, J., Sahlgren, M., Järvinen, T., Cöster, R. (2005). Dynamic Lexica for Query Translation. In: Peters, C., Clough, P., Gonzalo, J., Jones, G.J.F., Kluck, M., Magnini, B. (eds) Multilingual Information Access for Text, Speech and Images. CLEF 2004. Lecture Notes in Computer Science, vol 3491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11519645_15

  • DOI: https://doi.org/10.1007/11519645_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27420-9

  • Online ISBN: 978-3-540-32051-7

  • eBook Packages: Computer Science, Computer Science (R0)
