Combining Machine Translated Sentence Chunks from Multiple MT Systems

  • Matīss Rikters
  • Inguna Skadiņa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9624)

Abstract

This paper presents a hybrid machine translation (HMT) system that uses syntactic analysis to extract phrases from source sentences, translates the phrases using the application programming interfaces (APIs) of multiple online machine translation (MT) systems, and generates output by combining the translated chunks to obtain the best possible translation. The aim of this study is to improve the translation quality of English–Latvian texts over that of each individual MT API. The best translation hypothesis is selected by calculating the perplexity of each hypothesis with an n-gram language model. The result is a phrase-based multi-system machine translation system that improves MT output compared to individual online MT systems. The proposed approach shows an improvement of up to +1.48 points in BLEU and −0.015 in TER compared to the baselines and related research.
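
The perplexity-based selection step can be illustrated with a minimal sketch. It assumes a toy add-one smoothed bigram model trained on a placeholder English corpus, and it scores every chunk in isolation; the function names, the corpus, and the hypotheses are hypothetical and are not taken from the paper, whose language model and data are not described in this abstract.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(text, unigrams, bigrams):
    """Perplexity of `text` under an add-one smoothed bigram model."""
    tokens = ["<s>"] + text.split() + ["</s>"]
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(prob)
    # Normalise by the number of predicted tokens (bigram transitions).
    return math.exp(-log_prob / (len(tokens) - 1))


def combine_chunks(chunk_hypotheses, unigrams, bigrams):
    """Keep the lowest-perplexity hypothesis for each chunk, then join them."""
    best = [
        min(hypotheses, key=lambda t: perplexity(t, unigrams, bigrams))
        for hypotheses in chunk_hypotheses
    ]
    return " ".join(best)


# Placeholder data: a tiny corpus and two chunks with two candidate
# translations each (standing in for the output of different MT APIs).
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigram_lm(corpus)
chunks = [
    ["cat the", "the cat"],
    ["on sat mat the", "sat on the mat"],
]
combined = combine_chunks(chunks, unigrams, bigrams)
```

Lower perplexity means the language model finds the word sequence more probable, so the sketch keeps the lowest-scoring candidate for each chunk. A real system would use a much larger target-language model than this toy one.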

Keywords

Machine translation, Multi-system machine translation, Hybrid machine translation, Syntactic parsing, Chunking, Natural language processing

Notes

Acknowledgements

The research was supported by Grant 271/2012 from the Latvian Council of Science.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. University of Latvia, Riga, Latvia
  2. Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
