Language-Independent Hybrid MT: Comparative Evaluation of Translation Quality

Tambouratzis, George; Vassiliou, Marina; Sofianopoulos, Sokratis

doi:10.1007/978-3-319-21311-8_6

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

The present chapter reviews the development of a hybrid Machine Translation (MT) methodology, which is readily portable to new language pairs. This MT methodology (which has been developed within the PRESEMT project) is based on sampling mainly monolingual corpora, with very limited use of parallel corpora, thus supporting portability to new language pairs. In designing this methodology, no assumptions are made regarding the availability of extensive and expensive-to-create linguistic resources. In addition, the general-purpose NLP tools used can be chosen interchangeably. Thus PRESEMT circumvents the requirement for specialised resources and tools so as to further support the creation of MT systems for diverse language pairs.

In the current chapter, the proposed hybrid MT methodology is compared to established MT systems, both in terms of design concept and in terms of output quality. More specifically, the translation performance of the proposed methodology is evaluated against that of existing MT systems. The chapter summarises implementation decisions, using the Greek-to-English language pair as a test case. In addition, the detailed comparison of PRESEMT to other established MT systems provides insight into their relative advantages and disadvantages, focusing on specific translation tasks and addressing translation quality as well as translation consistency and stability. Finally, directions for improving the performance of PRESEMT are discussed. These improvements would allow PRESEMT to move beyond the original requirements for an MT system for gisting, towards a high-performing general-purpose MT system.

Author information

Authors and Affiliations

ILSP, Athens R.C., 6 Artemidos & Epidavrou Str., Paradissos Amaroussiou, 151 25, Marousi, Greece
George Tambouratzis, Marina Vassiliou & Sokratis Sofianopoulos

Corresponding author

Correspondence to George Tambouratzis.

Editor information

Editors and Affiliations

Universitat Politècnica de Catalunya, Barcelona, Spain
Marta R. Costa-jussà
University of Aix-Marseille and University of Mainz, Marseille, France
Reinhard Rapp
Pompeu Fabra University, Barcelona, Spain
Patrik Lambert
Lingenio GmbH, Heidelberg, Baden-Württemberg, Germany
Kurt Eberle
Institute for Infocomm Research, Singapore
Rafael E. Banchs
Centre for Translation Studies, University of Leeds School of Modern Languages & Cultures, Leeds, United Kingdom
Bogdan Babych

About this chapter

Cite this chapter

Tambouratzis, G., Vassiliou, M., Sofianopoulos, S. (2016). Language-Independent Hybrid MT: Comparative Evaluation of Translation Quality. In: Costa-jussà, M., Rapp, R., Lambert, P., Eberle, K., Banchs, R., Babych, B. (eds) Hybrid Approaches to Machine Translation. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-21311-8_6

DOI: https://doi.org/10.1007/978-3-319-21311-8_6
Published: 13 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21310-1
Online ISBN: 978-3-319-21311-8
eBook Packages: Computer Science, Computer Science (R0)
