Machine Translation, Volume 20, Issue 1, pp 1–23

EBMT by tree-phrasing

  • Philippe Langlais
  • Fabrizio Gotti
Original Paper

Abstract

This article presents an attempt to build a repository storing associations between simple syntactic dependency treelets in a source language and their corresponding phrases in a target language. We assess the usefulness of this resource in two different settings. First, we show that it improves upon a standard subsentential translation memory. Second, we observe improvements in translation quality when a standard statistical phrase-based translation engine is augmented with the ability to exploit such a repository.
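To make the idea of such a repository concrete, here is a minimal sketch of a lookup table from source-language treelets to target-language phrases. Everything in it is an assumption made for illustration: the encoding of a treelet as a head word plus its set of dependents, the names `treelet_key` and `TreeletRepository`, and the toy French-English entries. It is not the authors' implementation, which the abstract does not describe.

```python
from collections import Counter, defaultdict


def treelet_key(head, dependents):
    """Canonical key for a treelet: the head plus its dependents, sorted.

    Hypothetical encoding: a depth-one treelet is reduced to its words,
    ignoring dependency labels and word order.
    """
    return (head, tuple(sorted(dependents)))


class TreeletRepository:
    """Toy store of treelet -> target phrase associations with counts."""

    def __init__(self):
        self._table = defaultdict(Counter)

    def add(self, head, dependents, target_phrase):
        # Record one observation of this treelet aligned to this phrase.
        self._table[treelet_key(head, dependents)][target_phrase] += 1

    def lookup(self, head, dependents, n=3):
        # Return up to n target phrases, most frequently observed first.
        counts = self._table.get(treelet_key(head, dependents))
        return counts.most_common(n) if counts else []


# Invented example data, for illustration only.
repo = TreeletRepository()
repo.add("prendre", ["décision"], "make a decision")
repo.add("prendre", ["décision"], "make a decision")
repo.add("prendre", ["décision"], "take a decision")
candidates = repo.lookup("prendre", ["décision"])
```

Keying on a sorted tuple of dependents makes the lookup insensitive to the order in which the dependents appear in the source sentence. A real system would likely also need dependency labels and alignment scores, which this sketch leaves out.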

Keywords

  • Example-based machine translation
  • Translation memory
  • Statistical phrase-based machine translation

Copyright information

© Springer Science+Business Media 2006

Authors and Affiliations

  1. RALI-DIRO, Université de Montréal, Montréal, Canada
