Machine Translation, Volume 28, Issue 3–4, pp 309–339

Online adaptation to post-edits for phrase-based statistical machine translation

  • Nicola Bertoldi
  • Patrick Simianer
  • Mauro Cettolo
  • Katharina Wäschle
  • Marcello Federico
  • Stefan Riezler

Abstract

Recent research has shown that accuracy and speed of human translators can benefit from post-editing output of machine translation systems, with larger benefits for higher quality output. We present an efficient online learning framework for adapting all modules of a phrase-based statistical machine translation system to post-edited translations. We use a constrained search technique to extract new phrase-translations from post-edits without the need of re-alignments, and to extract phrase pair features for discriminative training without the need for surrogate references. In addition, a cache-based language model is built on \(n\)-grams extracted from post-edits. We present experimental results in a simulated post-editing scenario and on field-test data. Each individual module substantially improves translation quality. The modules can be implemented efficiently and allow for a straightforward stacking, yielding significant additive improvements on several translation directions and domains.
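As a rough illustration of the cache-based language model idea mentioned in the abstract, the sketch below collects \(n\)-grams from each post-edit and rewards recently seen ones. It is not the authors' implementation: the names `extract_ngrams` and `NGramCache`, the maximum order of 4, whitespace tokenization, and the exponential age decay are all assumptions made only for this example.

```python
import math
from typing import Dict, List, Tuple

NGram = Tuple[str, ...]


def extract_ngrams(tokens: List[str], max_order: int = 4) -> List[NGram]:
    """Return all n-grams of order 1..max_order, grouped by increasing order."""
    return [
        tuple(tokens[i:i + n])
        for n in range(1, max_order + 1)
        for i in range(len(tokens) - n + 1)
    ]


class NGramCache:
    """Toy cache that remembers when each n-gram was last seen in a post-edit.

    Hypothetical class for illustration; the scoring function is an assumption.
    """

    def __init__(self, max_order: int = 4) -> None:
        self.max_order = max_order
        self.clock = 0  # number of post-edits added so far
        self.last_seen: Dict[NGram, int] = {}

    def add_post_edit(self, post_edit: str) -> None:
        """Advance the clock and record every n-gram of the post-edit."""
        self.clock += 1
        for ngram in extract_ngrams(post_edit.split(), self.max_order):
            self.last_seen[ngram] = self.clock

    def score(self, ngram: NGram) -> float:
        """Return exp(-age) for cached n-grams and 0.0 for unseen ones."""
        seen = self.last_seen.get(ngram)
        if seen is None:
            return 0.0
        age = self.clock - seen  # 0 for n-grams from the latest post-edit
        return math.exp(-age)


cache = NGramCache(max_order=2)
cache.add_post_edit("the printer driver")
cache.add_post_edit("restart the printer")
print(cache.score(("the", "printer")))     # seen in the latest post-edit
print(cache.score(("printer", "driver")))  # seen one post-edit earlier
```

In a decoder, a score of this kind would enter the log-linear model as an additional feature whose weight is tuned like any other; how the cache is actually scored and integrated is described in the paper itself, not in this sketch.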

Keywords

Statistical machine translation, Post-editing, Online adaptation

Notes

Acknowledgments

FBK researchers were supported by the MateCat project, funded by the EC under FP7; researchers at Heidelberg University by DFG grant “Cross-language Learning-to-Rank for Patent Retrieval”.

Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  • Nicola Bertoldi (1)
  • Patrick Simianer (2)
  • Mauro Cettolo (1)
  • Katharina Wäschle (2)
  • Marcello Federico (1)
  • Stefan Riezler (2)

  1. FBK - Fondazione Bruno Kessler, Trento, Italy
  2. Department of Computational Linguistics, Heidelberg University, Heidelberg, Germany
