Abstract
This chapter reviews the development of a hybrid Machine Translation (MT) methodology that is readily portable to new language pairs. The methodology, developed within the PRESEMT project, relies mainly on sampling monolingual corpora and makes very limited use of parallel corpora. Its design makes no assumptions about the availability of extensive, expensive-to-create linguistic resources, and the general-purpose NLP tools it uses are interchangeable. PRESEMT thus circumvents the requirement for specialised resources and tools, which further supports the creation of MT systems for diverse language pairs.
The chapter compares the proposed hybrid MT methodology with established MT systems in terms of both design concept and output quality; in particular, its translation performance is evaluated against that of existing MT systems. The chapter summarises implementation decisions, using the Greek-to-English language pair as a test case. The detailed comparison of PRESEMT with other established MT systems also provides insight into their relative advantages and disadvantages, focusing on specific translation tasks and addressing translation quality as well as translation consistency and stability. Finally, directions for improving the performance of PRESEMT are discussed, with the aim of moving PRESEMT beyond its original requirement of an MT system for gisting towards a high-performing general-purpose MT system.
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Tambouratzis, G., Vassiliou, M., Sofianopoulos, S. (2016). Language-Independent Hybrid MT: Comparative Evaluation of Translation Quality. In: Costa-jussà, M., Rapp, R., Lambert, P., Eberle, K., Banchs, R., Babych, B. (eds) Hybrid Approaches to Machine Translation. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-21311-8_6
DOI: https://doi.org/10.1007/978-3-319-21311-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21310-1
Online ISBN: 978-3-319-21311-8
eBook Packages: Computer Science (R0)