OpenMaTrEx: A Free/Open-Source Marker-Driven Example-Based Machine Translation System

  • Sandipan Dandapat
  • Mikel L. Forcada
  • Declan Groves
  • Sergio Penkale
  • John Tinsley
  • Andy Way
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6233)

Abstract

We describe OpenMaTrEx, a free/open-source example-based machine translation (EBMT) system based on the marker hypothesis. It comprises a marker-driven chunker, a collection of chunk aligners, and two engines: one based on a simple proof-of-concept monotone EBMT recombinator and the other on a Moses-based statistical decoder. OpenMaTrEx is a free/open-source release of the basic components of MaTrEx, the Dublin City University machine translation system.
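As a rough illustration of what a marker-driven chunker does, the sketch below splits a tokenised sentence at closed-class "marker" words (determiners, prepositions, conjunctions, pronouns). It is a minimal sketch under stated assumptions, not OpenMaTrEx's actual code: the marker list `MARKERS` and the function `marker_chunk` are hypothetical, and the rule that a new chunk starts at a marker word only once the current chunk already holds a non-marker word is an assumed, commonly described formulation of marker-based chunking.

```python
# Hypothetical toy marker set, for illustration only; a real system would
# use per-language lists of closed-class words.
MARKERS = {"the", "a", "an", "in", "on", "of", "to", "and", "he", "she", "it"}


def marker_chunk(tokens, markers=MARKERS):
    """Split a token list into chunks that begin at marker words.

    Assumed rule: a marker word opens a new chunk only if the current
    chunk already contains at least one non-marker word, so every
    chunk (except possibly the last) holds some content word.
    """
    chunks = []
    current = []
    has_content = False  # True once `current` holds a non-marker word
    for tok in tokens:
        is_marker = tok.lower() in markers
        if is_marker and has_content:
            # Close the current chunk and start a new one at this marker.
            chunks.append(current)
            current = []
            has_content = False
        current.append(tok)
        if not is_marker:
            has_content = True
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    print(marker_chunk("the man in the house saw a dog".split()))
```

Run on the example sentence, this prints `[['the', 'man'], ['in', 'the', 'house', 'saw'], ['a', 'dog']]`. In a marker-based EBMT pipeline, chunks like these would be produced for both sides of a parallel corpus and then passed to the chunk aligners.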

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Sandipan Dandapat
    • 1
  • Mikel L. Forcada
    • 1
    • 2
  • Declan Groves
    • 1
    • 3
  • Sergio Penkale
    • 1
  • John Tinsley
    • 1
  • Andy Way
    • 1
  1. Centre for Next Generation Localisation, School of Computing, Dublin City University, Glasnevin, Ireland
  2. Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, Alacant, Spain
  3. Traslán Teoranta, Wicklow, Ireland
