Language-Independent Hybrid MT: Comparative Evaluation of Translation Quality

Hybrid Approaches to Machine Translation

Abstract

This chapter reviews the development of a hybrid Machine Translation (MT) methodology that is readily portable to new language pairs. The methodology, developed within the PRESEMT project, is based on sampling corpora that are mainly monolingual, with very limited use of parallel corpora. Its design makes no assumptions about the availability of extensive, expensive-to-create linguistic resources, and the general-purpose NLP tools it relies on can be chosen interchangeably. PRESEMT thus circumvents the need for specialised resources and tools, which further supports the creation of MT systems for diverse language pairs.

The chapter compares the proposed hybrid MT methodology with established MT systems in terms of both design concept and output quality, evaluating its translation performance against that of existing systems. Implementation decisions are summarised using the Greek-to-English language pair as a test case. The detailed comparison of PRESEMT with other established MT systems provides insight into their relative advantages and disadvantages, focusing on specific translation tasks and addressing translation quality as well as translation consistency and stability. Finally, directions for improving the performance of PRESEMT are discussed, with the aim of moving PRESEMT beyond the original requirement of an MT system for gisting towards a high-performing general-purpose MT system.

Notes

  1. www.presemt.eu

  2. http://www.presemt.eu/files/Dels/PRESEMT_D9.2_supplement.pdf

Author information

Corresponding author

Correspondence to George Tambouratzis.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Tambouratzis, G., Vassiliou, M., Sofianopoulos, S. (2016). Language-Independent Hybrid MT: Comparative Evaluation of Translation Quality. In: Costa-jussà, M., Rapp, R., Lambert, P., Eberle, K., Banchs, R., Babych, B. (eds) Hybrid Approaches to Machine Translation. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-21311-8_6

  • DOI: https://doi.org/10.1007/978-3-319-21311-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21310-1

  • Online ISBN: 978-3-319-21311-8

  • eBook Packages: Computer Science, Computer Science (R0)
