‘Irrefragable answers’ using comparable corpora to retrieve translation equivalents

Language Resources and Evaluation

Abstract

In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that translators consider difficult. For a phrase in the source language, the tool identifies a range of possible expressions used in similar contexts in target-language corpora and presents them to the translator as a list of suggestions. We discuss the method and present the results of a human evaluation of the tool's performance, which highlight its usefulness when dictionary solutions are lacking.
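
The general idea of retrieving expressions "used in similar contexts" can be illustrated with a toy sketch. This is not the authors' tool: it assumes a simple bag-of-words context model, a hypothetical seed dictionary (`seed_dict`) for mapping context words across languages, a hand-picked candidate list, and a fixed context window, none of which are specified in the abstract.

```python
from collections import Counter
from math import sqrt


def context_vector(word, sentences, window=2):
    """Count the words that occur within `window` tokens of `word`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def suggest(source_word, source_sents, target_sents, seed_dict, candidates, top_n=3):
    """Rank target-language candidates by similarity of their contexts
    to the (dictionary-mapped) contexts of the source word."""
    src_vec = context_vector(source_word, source_sents)
    # Project source-language context words into the target language.
    mapped = Counter()
    for w, c in src_vec.items():
        if w in seed_dict:
            mapped[seed_dict[w]] += c
    scored = [(cosine(mapped, context_vector(c, target_sents)), c) for c in candidates]
    scored.sort(key=lambda p: (-p[0], p[1]))
    return [c for score, c in scored[:top_n] if score > 0]


# Toy, hand-made data (hypothetical; real comparable corpora are far larger).
source_sents = [["strong", "tea", "with", "milk"], ["drink", "strong", "tea"]]
target_sents = [
    ["thé", "fort", "avec", "lait"],
    ["boire", "thé", "fort"],
    ["vent", "puissant", "et", "froid"],
]
seed_dict = {"tea": "thé", "milk": "lait", "drink": "boire", "with": "avec"}

suggestions = suggest("strong", source_sents, target_sents, seed_dict, ["fort", "puissant"])
print(suggestions)
```

In this toy setting only "fort" shares contexts with the mapped contexts of "strong", so it is the only suggestion returned; candidates with no contextual overlap are dropped rather than shown to the translator.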

Acknowledgements

This research is supported by EPSRC grant EP/C005902. We are grateful to the anonymous reviewers for their insightful comments and links to relevant research.

Author information

Corresponding author

Correspondence to Serge Sharoff.

About this article

Cite this article

Sharoff, S., Babych, B. & Hartley, A. ‘Irrefragable answers’ using comparable corpora to retrieve translation equivalents. Lang Resources & Evaluation 43, 15–25 (2009). https://doi.org/10.1007/s10579-007-9046-4

  • DOI: https://doi.org/10.1007/s10579-007-9046-4
