Machine Translation, Volume 24, Issue 3–4, pp 177–208

Panning for EBMT gold, or “Remembering not to forget”

Article

Abstract

A very useful service to the example-based machine translation (EBMT) community was provided by Harold Somers in his summary article which appeared in 1999, and was extended in our 2003 book Recent advances in example-based machine translation. As well as providing a comprehensive review of the paradigm, Somers gives a categorisation of the different instantiations of the basic model. In this paper, we provide a complementary view to that of Somers. Today’s EBMT systems learn by analogy. Perhaps even more so than statistical models of translation, one might view these systems as being incapable of forgetting. We researchers and system developers, on the other hand, often forget or are ignorant of techniques and models presented in prior research. The primary aim of this paper is to try to ensure that golden nuggets from past (now quite distantly so) EBMT research papers are gathered together and presented here for a new generation of researchers keen to operate in the paradigm, especially given the spate of recent open-source releases of EBMT systems. We revisit the findings of the previous main research papers, relate them to some of the major research efforts which have taken place since then, and examine especially the prophecies given in the older pieces of work to see the extent to which they have been borne out in the newer research. Given the strong convergence between the leading corpus-based approaches to MT, especially since the introduction of phrase-based statistical MT, a further hope is that these findings may also prove useful to researchers and developers in other areas of MT.

Keywords

Example-based machine translation; Statistical machine translation; Historical development; Convergence between paradigms; Scalability; Preprocessing; Retrieval; Search; Adaptation; Alignment; Recombination; Decoding; Postprocessing; String-based models; Tree-based models; Hybrid models; Word alignment; Phrase alignment; Subtree alignment; Generalized templates

References

  1. Alegria I, Casillas A, Díaz de Ilarraza A, Igartua J, Labaka G, Lersundi M, Mayor A, Sarasola K, Saralegi X, Laskurain B (2008) Mixing approaches to MT for Basque: selecting the best output from RBMT, EBMT and SMT. In: MATMT2008 workshop on mixing approaches to machine translation, proceedings, Donostia-San Sebastian, Spain, pp 27–34
  2. Alegria I, Díaz de Ilarraza A, Labaka G, Lersundi M, Mayor A, Sarasola K, Forcada M, Ortiz S, Padró L (2005) An open architecture for transfer-based machine translation between Spanish and Basque. In: Proceedings of the workshop on open source machine translation, MT summit X, Phuket, Thailand, pp 7–14
  3. Almuallim H, Akiba Y, Yamazaki T, Yokoo A, Kaneda S (1994) Two methods for learning ALT-J/E translation rules from examples and a semantic hierarchy. In: COLING 94: the 15th international conference on computational linguistics, Kyoto, Japan, pp 57–63
  4. Andriamanankasina T, Araki K, Tochinai K (2003) EBMT of POS-tagged sentences by recursive division via inductive learning. In: Carl and Way (2003) pp 225–252
  5. Aramaki E, Kurohashi S (2004) Example-based machine translation using structural translation examples. In: Proceedings of the international workshop on spoken language translation (IWSLT-04), Kyoto, Japan, pp 91–94
  6. Aramaki E, Kurohashi S, Kashioka H, Kato N (2005) Probabilistic model for example-based machine translation. In: MT summit X, the tenth machine translation summit, Phuket, Thailand, pp 219–226
  7. Armstrong S (2007) Using EBMT to produce foreign language subtitles. MSc Thesis, Dublin City University, Dublin, Ireland
  8. Bach N, Eck M, Charoenpornsawat P, Köhler T, Stüker S, Nguyen T, Hsiao R, Waibel A, Vogel S, Schultz T, Black A (2007) The CMU TransTac 2007 eyes-free and hands-free two-way speech-to-speech translation system. In: IWSLT 2007, Proceedings of the 4th international workshop on spoken language translation, Trento, Italy, pp 29–36
  9. Badia T, Boleda G, Melero M, Oliver A (2005) An n-gram approach to exploiting a monolingual corpus for machine translation. In: Proceedings of the second workshop on example-based machine translation, MT summit X, Phuket, Thailand, pp 1–7
  10. Baldwin T, Joseph M (2009) Restoring punctuation and casing in English text. In: Nicholson A, Li X (eds) AI 2009: advances in artificial intelligence, 22nd Australasian joint conference, Melbourne, Australia. LNCS 5866, Springer, Berlin, pp 547–556
  11. Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Intrinsic and extrinsic evaluation measures for summarization, proceedings of the ACL-05 workshop, Ann Arbor, MI, pp 65–73
  12. Bikel D (2004) A distributional analysis of a lexicalized statistical parsing model. In: Proceedings of the 2004 conference on empirical methods in natural language processing (EMNLP 2004), Barcelona, Spain, pp 182–189
  13. Bond F, Shirai S (2003) A hybrid rule and example-based method for machine translation. In: Carl and Way (2003) pp 211–224
  14. Brown PF, Cocke J, Della Pietra SA, Della Pietra VJ, Jelinek F, Lafferty JD, Mercer RL, Roossin PS (1990) A statistical approach to machine translation. Comput Linguist 16: 79–85
  15. Brown P, Cocke J, Della Pietra S, Della Pietra V, Jelinek F, Mercer R, Roossin P (1988) A statistical approach to French/English translation. In: Second international conference on theoretical and methodological issues in machine translation of natural languages, Pittsburgh, PA (pages not numbered)
  16. Brown PF, Della Pietra SA, Della Pietra VJ, Mercer RL (1993) The mathematics of statistical machine translation: parameter estimation. Comput Linguist 19: 263–311
  17. Brown RD (1996) Example-based machine translation in the Pangloss system. In: COLING-96: the 16th international conference on computational linguistics, Copenhagen, Denmark, pp 169–174
  18. Brown RD (1997) Automated dictionary extraction for ‘knowledge-free’ example-based translation. In: Proceedings of the 7th international conference on theoretical and methodological issues in machine translation, Santa Fe, NM, pp 111–118
  19. Brown RD (1999) Adding linguistic knowledge to a lexical example-based translation system. In: Proceedings of the 8th international conference on theoretical and methodological issues in machine translation (TMI 99), Chester, UK, pp 22–32
  20. Brown RD (2000) Automated generalization of translation examples. In: Proceedings of the 18th international conference on computational linguistics: COLING 2000 in Europe, Saarbrücken, Germany, pp 125–131
  21. Brown RD (2003) Clustered transfer rule induction for example-based translation. In: Carl and Way (2003) pp 287–305
  22. Brown RD, Hutchinson R, Bennett P, Carbonell J, Jansen P (2003) Reducing boundary friction using translation-fragment overlap. In: MT summit IX: proceedings of the ninth machine translation summit, New Orleans, USA, pp 24–31
  23. Callison-Burch C, Bannard C, Schroeder J (2005) A compact data structure for searchable translation memories. In: 10th EAMT conference, practical applications of machine translation, proceedings, Budapest, Hungary, pp 59–65
  24. Carl M (2007) METIS-II: the German to English MT system. In: MT summit XI, proceedings, Copenhagen, Denmark, pp 65–72
  25. Carl M, Hansen S (1999) Linking translation memories with example-based machine translation. In: Machine translation summit VII, Singapore, pp 617–624
  26. Carl M, Schmidt P, Schütz J (2005) Reversible template-based shake & bake generation. In: Proceedings of the second workshop on example-based machine translation, MT summit X, Phuket, Thailand, pp 17–25
  27. Carl M, Way A (eds) (2003) Recent advances in example-based machine translation. Kluwer, Dordrecht
  28. Carpuat M, Wu D (2005) Word sense disambiguation vs. statistical machine translation. In: 43rd annual meeting of the Association for Computational Linguistics (ACL-2005), Ann Arbor, MI, pp 387–394
  29. Carpuat M, Wu D (2007) How phrase sense disambiguation outperforms word sense disambiguation for statistical machine translation. In: TMI 2007: proceedings of the 11th international conference on theoretical and methodological issues in machine translation, Skövde, [Sweden], pp 43–52
  30. Chiang D (2005) A hierarchical phrase-based model for statistical machine translation. In: 43rd annual meeting of the Association for Computational Linguistics (ACL’05), Ann Arbor, MI, pp 263–270
  31. Christensen H, Gotoh Y, Renals S (2001) Punctuation annotation using statistical prosody models. In: Proceedings of the ISCA workshop on prosody in speech recognition and understanding, Red Bank, NJ, pp 35–40
  32. Cicekli I (2005) Learning translation templates with type constraints. In: Proceedings of the second workshop on example-based machine translation, MT summit X, Phuket, Thailand, pp 27–33
  33. Cicekli I, Güvenir HA (2003) Learning translation templates from bilingual translation examples. In: Carl and Way (2003) pp 255–286
  34. Civit M, Martí MA (2004) Building Cast3LB: a Spanish treebank. Res Lang Comput 2(4): 549–574
  35. Collins B, Somers H (2003) EBMT seen as case-based reasoning. In: Carl and Way (2003) pp 115–153
  36. Cranias L, Papageorgiou H, Piperidis S (1994) A matching technique in example-based machine translation. In: COLING 94: the 15th international conference on computational linguistics, Kyoto, Japan, pp 100–104
  37. Deng Y, Byrne W (2005) HMM word and phrase alignment for statistical machine translation. In: Proceedings of human language technology conference and conference on empirical methods in natural language processing, Vancouver, BC, Canada, pp 169–176
  38. Denoual E (2005) The influence of example-data: homogeneity on EBMT quality. In: Proceedings of the second workshop on example-based machine translation, MT summit X, Phuket, Thailand, pp 35–42
  39. Dong Z, Dong Q (2006) HowNet and the computation of meaning. World Scientific, River Edge, NJ
  40. Du J, He Y, Penkale S, Way A (2009) MaTrEx: the DCU MT system for WMT 2009. In: Proceedings of the fourth workshop on statistical machine translation, EACL 2009, Athens, Greece, pp 95–99
  41. Eck M, Vogel S, Waibel A (2005) Low cost portability for statistical machine translation based on n-gram coverage. In: MT summit X, the tenth machine translation summit, Phuket, Thailand, pp 227–234
  42. Flanagan M (2009) Using example-based machine translation to translate DVD Subtitles. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 85–92
  43. Forcada ML (2009) Foreword. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp i–ii
  44. Fraser A, Marcu D (2007) Getting the structure right for word alignment: LEAF. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), Prague, Czech Republic, pp 51–60
  45. Frederking R, Brown R (1996) The Pangloss-Lite machine translation system. In: Expanding MT horizons: proceedings of the second conference of the Association for Machine Translation in the Americas, Montreal, Canada, pp 268–272
  46. Fügen C, Waibel A, Kolss M (2007) Simultaneous translation of lectures and speeches. Mach Transl 21: 209–252
  47. Furuse O, Iida H (1992) An example-based method for transfer-driven machine translation. In: Fourth international conference on theoretical and methodological issues in machine translation: empiricist vs. rationalist methods in MT, TMI-92, Montréal, Canada, pp 139–150
  48. Galley M, Graehl J, Knight K, Marcu D, DeNeefe S, Wang W, Thayer I (2006) Scalable inference and training of context-rich syntactic translation models. In: Coling-ACL 2006: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the Association for Computational Linguistics, Sydney, Australia, pp 961–968
  49. Gangadharaiah R, Brown R, Carbonell J (2006) Spectral clustering for example based machine translation. In: HLT-NAACL 2006: human language technology conference of the North American chapter of the Association for Computational Linguistics, short papers, New York, NY, pp 41–44
  50. Germann U (2003) Greedy decoding for statistical machine translation in almost linear time. In: HLT-NAACL: human language technology conference of the North American chapter of the Association for Computational Linguistics, Edmonton, Alberta, Canada, pp 72–79
  51. Gotoh Y, Renals S (2000) Sentence boundary detection in broadcast speech transcripts. In: Proceedings of the ISCA workshop on automatic speech recognition: challenges for the new millennium (ASR-2000), Paris, France, pp 228–235
  52. Gough N, Way A (2004) Robust large-scale EBMT with marker-based segmentation. In: Proceedings of the tenth conference on theoretical and methodological issues in machine translation (TMI-04), Baltimore, MD, pp 95–104
  53. Groves D (2007) Hybrid data-driven models of machine translation. PhD Thesis, Dublin City University, Dublin, Ireland
  54. Groves D, Way A (2005a) Hybrid example-based SMT: the best of both worlds? In: Proceedings of the workshop, ACL–05 building and using parallel texts: data-driven machine translation and beyond, Ann Arbor, MI, pp 183–190
  55. Groves D, Way A (2005b) Hybrid data-driven models of machine translation. Mach Transl 19: 301–323
  56. Hanneman G, Huber E, Agarwal A, Ambati V, Parlikar A, Peterson E, Lavie A (2008) Statistical transfer systems for French–English and German–English machine translation. In: Proceedings of the Third workshop on statistical machine translation, Columbus, OH, pp 163–166
  57. Haque R, Naskar S, Ma Y, Way A (2009) Using supertags as source language context in SMT. In: EAMT-2009: Proceedings of the 13th annual meeting of the European Association for Machine Translation, Barcelona, Spain, pp 234–241
  58. Hassan H, Ma Y, Way A (2007a) MaTrEx: DCU machine translation system for IWSLT 2007. In: IWSLT 2007, proceedings of the 4th international workshop on spoken language translation, Trento, Italy, pp 69–75
  59. Hassan H, Sima’an K, Way A (2007b) Integrating supertags into phrase-based statistical machine translation. In: ACL 2007: Proceedings of the 45th annual meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp 288–295
  60. Hassan H, Sima’an K, Way A (2008) Syntactically lexicalized phrase-based SMT. IEEE Trans Audio Speech Lang Process 16: 1260–1273
  61. Hearne M (2005) Data-oriented models of parsing and translation. PhD Thesis, Dublin City University, Dublin, Ireland
  62. Hearne M, Ozdowska S, Tinsley J (2008) Comparing constituency and dependency representations for SMT phrase extraction. In: 15ème conférence sur le traitement automatique des langues naturelles (TALN 2008), Avignon, France
  63. Hearne M, Way A (2006) Disambiguation strategies for data-oriented translation. In: 11th annual conference of the European Association for Machine Translation, proceedings, Oslo, Norway, pp 59–68
  64. Hearne M, Way A (in press) Statistical machine translation: a guide for linguists and translators. To appear in Lang Linguist Compass 4
  65. Hutchins J (2005) Example-based machine translation: a review and commentary. Mach Transl 19: 197–211
  66. Imamura K, Sumita E, Matsumoto Y (2002) Automatic construction of machine translation knowledge using translation literalness. In: Proceedings of the 9th international conference on theoretical and methodological issues in machine translation (TMI-2002), Keihanna, Japan, pp 155–162
  67. Johnson H, Martin J, Foster G, Kuhn R (2007) Improving translation quality by discarding most of the phrasetable. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), Prague, Czech Republic, pp 967–975
  68. Johnson M (2002) The DOP estimation method is biased and inconsistent. Comput Linguist 28: 71–76
  69. Kaji H, Kida Y, Morimoto Y (1992) Learning translation templates from bilingual text. In: Proceedings of the fifteenth [sic] international conference on computational linguistics, COLING-92, Nantes, France, pp 672–678
  70. Kim J, Brown R, Jansen P, Carbonell J (2005) Symmetric probabilistic alignment for example-based translation. In: 10th EAMT conference, practical applications of machine translation, proceedings, Budapest, Hungary, pp 153–159
  71. Kitano H (1994) Speech-to-speech translation: a massively parallel memory-based approach. Kluwer, Boston
  72. Koehn P (2002) Europarl: a multilingual corpus for evaluation of machine translation, 18 pp. http://people.csail.mit.edu/koehn/publications/europarl.ps
  73. Koehn P (2004) Pharaoh: a beam search decoder for phrase-based statistical machine translation models. In: Frederking R, Taylor K (eds) Machine translation: from real users to research; 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, LNAI 3265. Springer, Berlin, pp 115–124
  74. Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. In: MT summit X, the tenth machine translation summit, Phuket, Thailand, pp 79–86
  75. Koehn P, Birch A, Steinberger R (2009) 462 machine translation systems for Europe. In: MT summit XII, proceedings of the twelfth machine translation summit, Ottawa, ON, Canada, pp 65–72
  76. Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: ACL 2007 proceedings of the interactive poster and demonstration sessions, Prague, Czech Republic, pp 177–180
  77. Koehn P, Och F, Marcu D (2003) Statistical phrase-based machine translation. In: HLT-NAACL: human language technology conference of the North American chapter of the Association for Computational Linguistics, Edmonton, Alberta, Canada, pp 127–133
  78. Lambert P, Banchs R, Crego J (2007) Discriminative alignment training without annotated data for machine translation. In: Human language technologies 2007: the conference of the North American chapter of the Association for Computational Linguistics, Rochester, NY, pp 85–88
  79. Langlais P, Gotti F (2006) EBMT by tree-phrasing. Mach Transl 20: 1–23
  80. Lardilleux A, Chevelu J, Lepage Y, Gosme J, Putois G (2009) Lexicons or phrase tables? An investigation in sampling-based multilingual alignment. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 45–52
  81. Lavie A (2008) Stat-XFER: a general search-based syntax-driven framework for machine translation. In: CICLing-2008: proceedings of the 9th international conference on intelligent text processing and computational linguistics. LNCS 4919. Springer, Berlin, pp 362–375
  82. Lepage Y, Denoual E (2005) Purest ever example-based machine translation: detailed presentation and assessment. Mach Transl 19: 251–282
  83. Li Z, Chao W, Chen Y (2008) An example-based decoder for spoken language machine translation. In: IJCNLP 2008: sixth SIGHAN workshop on Chinese language processing, proceedings, Hyderabad, India, pp 1–8
  84. Liang P, Taskar B, Klein D (2006) Alignment by agreement. In: Proceedings of the human language technology conference of the North American chapter of the Association of Computational Linguistics, New York, NY, pp 104–111
  85. Liu Y, Shriberg E, Stolcke A, Hillard D, Ostendorf M, Harper M (2006a) Enriching speech recognition with automatic detection of sentence boundaries and disfluencies. IEEE Trans Audio Speech Lang Process 14: 1526–1540
  86. Liu Z, Wang H, Wu H (2006b) Example-based machine translation based on tree-string correspondence and statistical generation. Mach Transl 20: 25–41
  87. Lopez A (2008) Tera-scale translation models via pattern matching. In: Coling 2008: 22nd international conference on computational linguistics, Manchester, UK, pp 505–512
  88. Lu Y, Huang J, Liu Q (2007) Improving statistical machine translation performance by training data selection and optimization. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL 2007), Prague, Czech Republic, pp 343–350
  89. Ma Y, Stroppa N, Way A (2007a) Bootstrapping word alignment via word packing. In: Proceedings of the 45th annual meeting of the Association for Computational Linguistics (ACL-07), Prague, Czech Republic, pp 304–311
  90. Ma Y, Stroppa N, Way A (2007b) Alignment-guided chunking. In: TMI 2007: Proceedings of the 11th international conference on theoretical and methodological issues in machine translation, Skövde, [Sweden], pp 114–121
  91. Ma Y, Sun Y, Ozdowska S, Way A (2008) Improving word alignment using syntactic dependencies. In: Proceedings of the second workshop on syntax and structure in statistical translation (SSST-2), Columbus, OH, pp 69–77
  92. Malavazos C, Piperidis S (2000) Application of analogical modelling to example based machine translation. In: Proceedings of the 18th international conference on computational linguistics: COLING 2000 in Europe, Saarbrücken, Germany, pp 516–522
  93. Marcu D, Wang W, Echihabi A, Knight K (2006) SPMT: statistical machine translation with syntactified target language phrases. In: Proceedings of the 2006 conference on empirical methods in natural language processing (EMNLP 2006), Sydney, Australia, pp 44–52
  94. Marcu D, Wong W (2002) A phrase-based, joint probability model for statistical machine translation. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP-02), Philadelphia, PA, pp 133–139
  95. Marton Y, Resnik P (2008) Soft syntactic constraints for hierarchical phrase-based translation. In: 46th annual meeting of the Association for Computational Linguistics: human language technologies, Columbus, OH, pp 1003–1011
  96. Maruyama H, Watanabe H (1992) Tree cover search algorithm for example-based translation. In: Fourth international conference on theoretical and methodological issues in machine translation: empiricist vs. rationalist methods in MT, TMI-92, Montréal, Canada, pp 173–184
  97. Matsumoto Y, Kitamura M (1995) Acquisition of translation rules from parallel corpora. In: Mitkov R, Nicolov N (eds) Recent advances in natural language processing: selected papers from the conference. John Benjamins, Amsterdam, pp 405–416
  98. McTait K (2003) Translation patterns, linguistic knowledge and complexity in an approach to EBMT. In: Carl and Way (2003) pp 307–338
  99. McTait K, Olohan M, Trujillo A (1999) A building blocks approach to translation memory. In: Translating and the computer 21, London (pages not numbered)
  100. Menezes A, Richardson S (2003) A best-first alignment algorithm for automatic extraction of transfer mappings from bilingual corpora. In: Carl and Way (2003) pp 421–442
  101. Morrissey S, Way A (2006) Lost in translation: the problems of using mainstream MT evaluation metrics for sign language translation. In: Proceedings of the SALTMIL workshop on minority languages, 5th international conference on language resources and evaluation (LREC 2006), Genoa, Italy, pp 91–98
  102. Morrissey S, Way A, Stein D, Bungeroth J, Ney H (2007) Combining data-driven MT systems for improved sign language translation. In: MT summit XI, proceedings, Copenhagen, Denmark, pp 329–336
  103. Nagao M (1984) A framework of a mechanical translation between Japanese and English by analogy principle. In: Elithorn A, Banerji R (eds) Artificial and human intelligence. North-Holland, Amsterdam, pp 173–180
  104. Nakazawa T, Kurohashi S (2008) Kyoto-U: syntactical [sic] EBMT system for NTCIR-7 patent translation task. In: Proceedings of NTCIR-7 workshop meeting, Tokyo, Japan, pp 401–408
  105. Nakazawa T, Yu K, Kawahara D, Kurohashi S (2006) Example-based machine translation based on deeper NLP. In: IWSLT 2006, proceedings of the 3rd international workshop on spoken language translation, Kyoto, Japan, pp 64–70
  106. Nie J-Y, Cai J (2001) Filtering noisy parallel corpora of web pages. In: IEEE symposium on NLP and knowledge engineering, Tucson, AZ, pp 453–458
  107. Nirenburg S, Domashnev C, Grannes DJ (1993) Two approaches to matching in example-based machine translation. In: Proceedings of the fifth international conference on theoretical and methodological issues in machine translation TMI ’93: MT in the next generation, Kyoto, Japan, pp 47–57
  108. Nomiyama H (1992) Machine translation by case generalization. In: Proceedings of the fifteenth [sic] international conference on computational linguistics, COLING-92, Nantes, France, pp 714–720
  109. Och F, Ney H (2002) Discriminative training and maximum entropy models for statistical machine translation. In: 40th annual meeting of the Association for Computational Linguistics, Philadelphia, PA, pp 295–302
  110. Och F, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29: 19–51
  111. Och F, Ney H (2004) The alignment template approach to statistical machine translation. Comput Linguist 30: 417–449
  112. Öz Z, Cicekli I (1998) Ordering translation templates by assigning confidence factors. In: Farwell D, Gerber L, Hovy E (eds) Machine translation and the information soup: third conference of the Association for Machine Translation in the Americas, AMTA’98. Springer, Berlin, pp 51–61
  113. Ozdowska S, Way A (2009) Optimal bilingual data for French–English PB-SMT. In: Proceedings of EAMT-09, the 13th annual meeting of the European Association for Machine Translation, Barcelona, Spain, pp 96–103
  114. Petrov S, Klein D (2007) Improved inference for unlexicalized parsing. In: Human language technologies 2007: the conference of the North American chapter of the Association for Computational Linguistics, Rochester, NY, pp 404–411
  115. Phillips A (2007) Sub-phrasal matching and structural templates in example-based MT. In: TMI 2007: proceedings of the 11th international conference on theoretical and methodological issues in machine translation, Skövde, [Sweden], pp 163–170
  116. Phillips A, Brown R (2009) Cunei machine translation platform: system description. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 29–36
  117. Quirk C, Menezes A (2006) Dependency treelet translation: the convergence of statistical and example-based machine translation?. Mach Transl 20: 43–65CrossRefGoogle Scholar
  118. Rapp R, Martin Vide C (2006) Example-based machine translation using a dictionary of word pairs. In: LREC-2006: fifth international conference on language resources and evaluation, proceedings, Genoa, Italy, pp 1268–1273Google Scholar
  119. Richardson SD, Dolan WB, Menezes A, Pinkham J (2001) Achieving commercial-quality translation with example-based methods. In: MT summit VIII: machine translation in the information age, Santiago de Compostela, Spain, pp 293–298Google Scholar
  120. Sadler V, Vendelmans R (1990) Pilot implementation of a bilingual knowledge bank. In: COLING-90, papers presented to the 13th international conference on computational linguistics, Helsinki, Finland, vol 3, pp 449–451Google Scholar
  121. Samuelsson Y, Volk M (2007) Alignment tools for parallel treebanks. In: Proceedings of the biennial GLDV conference, Tübingen, GermanyGoogle Scholar
  122. Sánchez-Martínez F, Forcada M, Way A (2009) Hybrid rule-based – example-based MT: feeding Apertium with sub-sentential translation units. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 11–18Google Scholar
  123. Sánchez-Martínez F, Way A (2009) Marker-based filtering of bilingual phrase pairs for SMT. In: Proceedings of EAMT-09, the 13th annual meeting of the European Association for Machine Translation, Barcelona, Spain, pp 144–151Google Scholar
  124. Sato S (1993) Example-based translation of technical terms. In: Proceedings of the fifth international conference on theoretical and methodological issues in machine translation TMI ’93: MT in the next generation, Kyoto, Japan, pp 58–68
  125. Sato S, Nagao M (1990) Toward memory-based translation. In: COLING-90, papers presented to the 13th international conference on computational linguistics, Helsinki, Finland, vol 3, pp 247–252
  126. Schäler R, Way A, Carl M (2003) Example-based machine translation in a controlled environment. In: Carl and Way (2003) pp 83–114
  127. Shen L, Sarkar A, Och FJ (2004) Discriminative reranking for machine translation. In: Proceedings of the human language technology conference of the North American chapter of the Association for Computational Linguistics: HLT-NAACL 2004, Boston, MA, pp 177–184
  128. Shriberg E, Stolcke A, Hakkani-Tür D, Tür G (2000) Prosody-based automatic segmentation of speech into sentences and topics. Speech Commun 32: 127–154
  129. Sima’an K, Buratto L (2003) Backoff parameter estimation for the DOP model. In: Proceedings of the 14th European conference on machine learning (ECML’03), Cavtat-Dubrovnik, Croatia, pp 373–384
  130. Simard M, Langlais P (2002) Merging example-based and statistical machine translation. In: Richardson S (ed) Machine translation: from research to real users, 5th conference of the Association for Machine Translation in the Americas (AMTA-2002). LNAI 2499. Springer, Berlin, pp 104–113
  131. Simard M, Ueffing N, Isabelle P, Kuhn R (2007) Rule-based translation with statistical phrase-based post-editing. In: ACL 2007: proceedings of the second workshop on statistical machine translation, Prague, Czech Republic, pp 203–206
  132. Smith J, Clark S (2009) EBMT for SMT: a new EBMT–SMT hybrid. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 3–10
  133. Somers H (1999) Review article: example-based machine translation. Mach Transl 14: 113–157
  134. Somers H (2003) An overview of EBMT. In: Carl and Way (2003) pp 3–57
  135. Somers H, Dandapat S, Naskar S (2009) A review of EBMT using proportional analogies. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 53–60
  136. Somers H, McLean I, Jones D (1994) Experiments in multilingual example-based generation. In: CSNLP 1994: 3rd international conference on the cognitive science of natural language processing, Dublin, Ireland (pages not numbered)
  137. Srivastava A, Penkale S, Groves D, Tinsley J (2009) Evaluating syntax-driven approaches to phrase extraction for MT. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 19–28
  138. Stein D, Dreuw P, Ney H, Morrissey S, Way A (2007) Hand in hand: automatic sign language to English translation. In: TMI 2007: proceedings of the 11th international conference on theoretical and methodological issues in machine translation, Skövde, [Sweden], pp 214–220
  139. Stroppa N, van den Bosch A, Way A (2007) Exploiting source similarity for SMT using context-informed features. In: TMI 2007: proceedings of the 11th international conference on theoretical and methodological issues in machine translation, Skövde, [Sweden], pp 231–240
  140. Stroppa N, Way A (2006) MaTrEx: DCU machine translation system for IWSLT 2006. In: IWSLT 2006, proceedings of the 3rd international workshop on spoken language translation, Kyoto, Japan, pp 31–36
  141. Sumita E (2003) An example-based machine translation system using DP-matching between word sequences. In: Carl and Way (2003) pp 189–209
  142. Sumita E, Furuse O, Iida H (1993a) An example-based disambiguation of prepositional phrase attachment. In: TMI-93: the fifth international conference on theoretical and methodological issues in machine translation, proceedings, Kyoto, Japan, pp 80–91
  143. Sumita E, Iida H (1991) Experiments and prospects of example-based machine translation. In: 29th annual meeting of the Association for Computational Linguistics (ACL-91), Berkeley, CA, pp 185–192
  144. Sumita E, Iida H, Kohyama H (1990) Translating with examples: a new approach to machine translation. In: Third international conference on theoretical and methodological issues in machine translation of natural language, Austin, TX, pp 203–212
  145. Sumita E, Oi K, Furuse O, Iida H, Higuchi T, Takahashi N, Kitano H (1993b) Example-based machine translation on massively parallel processors. In: Proceedings of the thirteenth international joint conference on artificial intelligence, Chambéry, France, pp 1283–1288
  146. Sumita E, Tsutsumi Y (1988) A translation aid system using flexible text retrieval based on syntax-matching. In: Second international conference on theoretical and methodological issues in machine translation of natural languages, proceedings supplement, Pittsburgh, PA (pages not numbered)
  147. Tiedemann J (2003) Combining clues for word alignment. In: 10th EAMT conference, practical applications of machine translation, proceedings, Budapest, Hungary, pp 339–346
  148. Tinsley J, Hearne M, Way A (2007a) Exploiting parallel treebanks to improve phrase-based statistical machine translation. In: Proceedings of the sixth international workshop on treebanks and linguistic theories (TLT-07), Bergen, Norway, pp 175–187
  149. Tinsley J, Ma Y, Ozdowska S, Way A (2008) MaTrEx: the DCU MT system for WMT 2008. In: Third workshop on statistical machine translation, Columbus, OH, pp 171–174
  150. Tinsley J, Way A (2009) Automatically generated parallel treebanks and their exploitability in machine translation. Mach Transl 23: 1–22
  151. Tinsley J, Zhechev V, Hearne M, Way A (2007b) Robust language pair-independent sub-tree alignment. In: MT summit XI, Copenhagen, Denmark, pp 329–336
  152. Turcato D, Popowich F (2003) What is example-based machine translation? In: Carl and Way (2003) pp 59–81
  153. Turchi M, De Bie T, Cristianini N (2008) Learning performance of a machine translation system: a statistical and computational analysis. In: Third workshop on statistical machine translation, Columbus, OH, pp 35–43
  154. Utsuro T, Uchimoto K, Matsumoto M, Nagao M (1994) Thesaurus-based efficient example retrieval by generating retrieval queries from similarities. In: COLING 94: the 15th international conference on computational linguistics, Kyoto, Japan, pp 1044–1048
  155. Vandeghinste V, Dirix P, Schuurman I (2005) Example-based translation without parallel corpora: first experiments on a prototype. In: Proceedings of the second workshop on example-based machine translation, MT summit X, Phuket, Thailand, pp 135–142
  156. Vandeghinste V, Martens S (2009) Top-down transfer in example-based MT. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 69–76
  157. van den Bosch A, Stroppa N, Way A (2007) A memory-based classification approach to marker-based EBMT. In: Proceedings of the METIS-II workshop on new approaches to machine translation, Leuven, Belgium, pp 63–72
  158. van Gompel M, van den Bosch A, Berck P (2009) Extending memory-based machine translation to phrases. In: Proceedings of the 3rd international workshop on example-based machine translation, Dublin, Ireland, pp 61–68
  159. Vauquois B (1968) A survey of formal grammars and algorithms for recognition and transformation in machine translation. In: IFIP congress-68, Edinburgh, pp 254–260; reprinted in Boitet Ch (ed) (1988) Bernard Vauquois et la TAO: vingt-cinq ans de traduction automatique – Analectes. Association Champollion, Grenoble, pp 201–213
  160. Veale T, Way A (1997) Gaijin: a bootstrapping approach to example-based machine translation. In: International conference, recent advances in natural language processing, Tzigov Chark, Bulgaria, pp 239–244
  161. Vilar D, Stein D, Ney H (2008) Analysing soft syntax features and heuristics for hierarchical phrase-based machine translation. In: Proceedings of the international workshop on spoken language translation, Honolulu, HI, pp 190–197
  162. Volk M, Samuelsson Y (2004) Bootstrapping parallel treebanks. In: Proceedings of the 5th international workshop on linguistically interpreted corpora (COLING 2004), Geneva, Switzerland, pp 63–69
  163. Wagner RA, Fischer MJ (1974) The string-to-string correction problem. J Assoc Comput Mach 21: 168–173
  164. Wang S (2002) Machine translation on noisy training data. Master’s Thesis, University of Southern California, San Diego
  165. Watanabe H (1992) A similarity-driven transfer system. In: Proceedings of the fifteenth [sic] international conference on computational linguistics, COLING-92, Nantes, France, pp 770–776
  166. Watanabe H (1994) A method for distinguishing exceptional and general examples in example-based transfer systems. In: COLING 94: the 15th international conference on computational linguistics, Kyoto, Japan, pp 39–44
  167. Way A (1999) A hybrid architecture for robust MT using LFG-DOP. J Exp Theor Artif Intell 11: 441–471
  168. Way A (2003) Machine translation using LFG-DOP. In: Bod R, Scha R, Sima’an K (eds) Data-oriented parsing. CSLI, Stanford, pp 359–384
  169. Way A (2009) A critique of statistical machine translation. Linguist Antverp, new series 7: 17–41
  170. Way A (2010) Machine translation. In: Lappin S, Fox C, Clark A (eds) Handbook for computational linguistics and natural language processing. Wiley Blackwell, Chichester, pp 531–573
  171. Way A, Gough N (2003) wEBMT: developing and validating an EBMT system using the world wide web. Comput Linguist 29: 421–457
  172. Way A, Gough N (2005a) Controlled translation in an example-based environment. Mach Transl 19: 1–36
  173. Way A, Gough N (2005b) Comparing example-based and statistical machine translation. Nat Lang Eng 11: 295–309
  174. Whitelock P (1992) Shake-and-bake translation. In: Proceedings of the fifteenth [sic] international conference on computational linguistics, COLING-92, Nantes, France, pp 784–791
  175. Whitelock P, Poznanski V (2006) The SLE example-based translation system. In: IWSLT 2006, proceedings of the 3rd international workshop on spoken language translation, Kyoto, Japan, pp 111–115
  176. Wu D (1997) Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Comput Linguist 23: 377–403
  177. Wu D (2005) MT model space: statistical vs. compositional vs. example-based machine translation. Mach Transl 19: 213–227
  178. Yamamoto K, Matsumoto Y (2003) Extracting translation knowledge from parallel corpora. In: Carl and Way (2003) pp 365–395
  179. Yang M, Zheng J (2009) Toward smaller, faster, and better hierarchical phrase-based SMT. In: Joint conference of the 47th annual meeting of the Association for Computational Linguistics and the 4th international joint conference on natural language processing of the AFNLP, proceedings of the conference short papers, Suntec, Singapore, pp 237–240
  180. Zens R, Ney H (2004) Improvements in phrase-based statistical machine translation. In: Proceedings of the human language technology conference of the North American chapter of the Association for Computational Linguistics: HLT-NAACL 2004, Boston, MA, pp 257–264
  181. Zhang Y, Vogel S (2005) An efficient phrase-to-phrase alignment model for arbitrarily long phrase [sic] and large corpora. In: 10th EAMT conference, practical applications of machine translation, proceedings, Budapest, Hungary, pp 294–301
  182. Zhao B, Al-Onaizan Y (2008) Generalizing local and non-local word-reordering patterns for syntax-based machine translation. In: EMNLP 2008, conference on empirical methods in natural language processing, Waikiki, HI, pp 572–581
  183. Zollmann A, Sima’an K (2005) A consistent and efficient estimator for data-oriented parsing. J Autom Lang Comb 10: 367–388

Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  1. CNGL, School of Computing, Dublin City University, Dublin 9, Ireland
