Language-Independent Hybrid MT: Comparative Evaluation of Translation Quality

  • George Tambouratzis
  • Marina Vassiliou
  • Sokratis Sofianopoulos
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

This chapter reviews the development of a hybrid Machine Translation (MT) methodology that is readily portable to new language pairs. The methodology, developed within the PRESEMT project, is based on sampling mainly monolingual corpora, with very limited use of parallel corpora. Its design makes no assumptions about the availability of extensive, expensive-to-create linguistic resources, and the general-purpose NLP tools it uses can be chosen interchangeably. PRESEMT thus circumvents the requirement for specialised resources and tools, which further supports the creation of MT systems for diverse language pairs.

The chapter compares the proposed hybrid MT methodology with established MT systems in terms of both design concept and output quality, evaluating its translation performance against that of existing systems. Implementation decisions are summarised using the Greek-to-English language pair as a test case. The detailed comparison of PRESEMT with other established MT systems provides insight into their relative advantages and disadvantages, focusing on specific translation tasks and addressing translation quality as well as translation consistency and stability. Finally, directions for improving the performance of PRESEMT are discussed, which would allow PRESEMT to move beyond the original requirement for an MT system for gisting towards a high-performing general-purpose MT system.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • George Tambouratzis (1)
  • Marina Vassiliou (1)
  • Sokratis Sofianopoulos (1)

  1. ILSP, Athens R.C., Marousi, Greece
