‘Irrefragable answers’ using comparable corpora to retrieve translation equivalents
In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that translators consider difficult. For a phrase in the source language, the tool identifies a range of possible expressions used in similar contexts in target-language corpora and presents them to the translator as a list of suggestions. We discuss the method and present the results of a human evaluation of the tool's performance, which highlight its usefulness when dictionary solutions are lacking.
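The idea of ranking target-language expressions by how similar their contexts are to those of a source phrase can be illustrated with a minimal sketch. This is not the tool described in the paper: it assumes pre-tokenised sentences, a small seed dictionary for mapping context words across languages, a fixed context window, and cosine similarity over raw co-occurrence counts. All function names, parameters, and the toy English and Spanish data are hypothetical.

```python
from collections import Counter
from math import sqrt


def context_vector(phrase, sentences, window=3):
    """Count words within `window` tokens of each occurrence of `phrase`."""
    target = phrase.split()
    n = len(target)
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                left = tokens[max(0, i - window):i]
                right = tokens[i + n:i + n + window]
                counts.update(left + right)
    return counts


def translate_vector(vector, seed_dict):
    """Map source context words to target words via a seed dictionary (assumed given)."""
    mapped = Counter()
    for word, freq in vector.items():
        for translation in seed_dict.get(word, []):
            mapped[translation] += freq
    return mapped


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest(phrase, src_sentences, tgt_sentences, candidates, seed_dict, window=3):
    """Rank candidate target expressions by context similarity to the source phrase."""
    src_vec = translate_vector(context_vector(phrase, src_sentences, window), seed_dict)
    scored = [
        (cand, cosine(src_vec, context_vector(cand, tgt_sentences, window)))
        for cand in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data, invented purely for illustration.
src_sentences = [
    "the minister gave a clear answer to the question".split(),
    "she gave a clear answer in parliament".split(),
]
tgt_sentences = [
    "el ministro dio una respuesta clara a la pregunta".split(),
    "el parlamento debate una ley nueva hoy".split(),
]
seed_dict = {
    "minister": ["ministro"],
    "gave": ["dio"],
    "question": ["pregunta"],
    "parliament": ["parlamento"],
}

ranked = suggest("clear answer", src_sentences, tgt_sentences,
                 ["respuesta clara", "ley nueva"], seed_dict)
for candidate, score in ranked:
    print(f"{candidate}\t{score:.3f}")
```

On this toy data the candidate whose neighbouring words overlap most with the translated source contexts is ranked first. A realistic system would need much larger corpora, association weighting rather than raw counts, and a way of generating candidates instead of a hand-supplied list.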
Language Resources and Evaluation
Volume 43, Issue 1, pp 15-25
- Springer Netherlands
- Large comparable corpora
- Translation equivalents
- Multiword expressions
- Distributional similarity