Machine Translation, Volume 32, Issue 4, pp 309–324

A user-study on online adaptation of neural machine translation to human post-edits

  • Sariya Karimova
  • Patrick Simianer
  • Stefan Riezler

Abstract

The advantages of neural machine translation (NMT) have been extensively validated for offline translation of several language pairs across different domains of spoken and written language. However, research on interactive learning of NMT by adaptation to human post-edits has so far been confined to simulation experiments. We present the first user study on online adaptation of NMT to user post-edits in the domain of patent translation. Our study involves 29 human subjects (translation students) whose post-editing effort and translation quality were measured over about 4500 interactions between human post-editors and an NMT system that integrates an online adaptive learning algorithm. Our experimental results show a significant reduction in human post-editing effort due to online adaptation, as measured by several evaluation metrics including hTER, hBLEU, and KSMR. Furthermore, we found significant improvements in BLEU/TER between NMT outputs and professional translations in granted patents, providing further evidence for the advantages of online adaptive NMT in an interactive setup.
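To make the interactive setup concrete, the sketch below outlines the post-editing loop described in the abstract: the system proposes a translation, a human post-edits it, the model adapts online to that post-edit, and post-editing effort is logged. This is an illustrative Python sketch, not the authors' implementation: the NMTModel interface (translate, update) is hypothetical, and edit_rate is a simplified word-level edit rate in the spirit of hTER that omits TER's block-shift operation.

    from typing import Callable, List, Tuple


    class NMTModel:
        """Hypothetical interface for an online-adaptive NMT system."""

        def translate(self, source: str) -> str:
            raise NotImplementedError

        def update(self, source: str, post_edit: str) -> None:
            """One online learning step treating the post-edit as the reference."""
            raise NotImplementedError


    def edit_rate(hypothesis: str, post_edit: str) -> float:
        """Word-level edit distance from MT output to its post-edit, normalized
        by post-edit length. Unlike true (h)TER there is no block-shift move,
        so this only approximates the metric reported in the paper."""
        hyp, ref = hypothesis.split(), post_edit.split()
        # standard Levenshtein dynamic program over words
        d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
        for i in range(len(hyp) + 1):
            d[i][0] = i
        for j in range(len(ref) + 1):
            d[0][j] = j
        for i in range(1, len(hyp) + 1):
            for j in range(1, len(ref) + 1):
                cost = 0 if hyp[i - 1] == ref[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # delete a hypothesis word
                              d[i][j - 1] + 1,          # insert a reference word
                              d[i - 1][j - 1] + cost)   # substitute or match
        return d[len(hyp)][len(ref)] / max(len(ref), 1)


    def post_editing_session(model: NMTModel,
                             sources: List[str],
                             get_post_edit: Callable[[str, str], str]
                             ) -> List[Tuple[str, str, float]]:
        """Translate each segment, collect the human post-edit, adapt, log effort."""
        log = []
        for source in sources:
            hypothesis = model.translate(source)           # system proposes a translation
            post_edit = get_post_edit(source, hypothesis)  # human corrects the proposal
            model.update(source, post_edit)                # online adaptation step
            log.append((source, hypothesis, edit_rate(hypothesis, post_edit)))
        return log

In an actual system, update would typically perform one or a few parameter-update steps on the (source, post-edit) pair; hTER and hBLEU then compare each original hypothesis against its post-edit, while KSMR additionally counts the keystrokes and mouse actions expended by the post-editor.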

Keywords

Online adaptation · Post-editing · Neural machine translation

Acknowledgements

The research reported in this paper was supported in part by the German Research Foundation (DFG) under Grant RI-2221/4-1.

Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Department of Computational Linguistics, Heidelberg University, Heidelberg, Germany
  2. Kazan Federal University, Kazan, Russia