Scratching the Surface of Possible Translations

  • Ondřej Bojar
  • Matouš Macháček
  • Aleš Tamchyna
  • Daniel Zeman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8082)


One of the key obstacles in automatic evaluation of machine translation systems is the reliance on a few (typically just one) human-made reference translations to which the system output is compared. We propose a method of capturing millions of possible translations and implement a tool for translators to specify them using a compact representation. We evaluate this new type of reference set by edit distance and correlation to human judgements of translation quality.
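As a rough illustration of the idea in the abstract, the sketch below expands a compact representation into its full set of reference translations and scores a system output by its word-level edit distance to the nearest one. The slot format (a list of positions, each holding alternative phrases), the function names `expand`, `edit_distance` and `min_distance`, and the English example are all assumptions made for illustration; they are not the representation or the tool described in the paper.

```python
from itertools import product


def expand(slots):
    """Yield every token sequence encoded by a list of slots.

    Each slot is a list of alternative phrases. This format is a
    hypothetical stand-in, not the representation used in the paper.
    """
    for choice in product(*slots):
        # Join the chosen phrases, skipping empty alternatives, then tokenize.
        yield " ".join(p for p in choice if p).split()


def edit_distance(a, b):
    """Word-level Levenshtein distance between token lists a and b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute or match
        prev = curr
    return prev[-1]


def min_distance(hypothesis, slots):
    """Distance from a system output to its closest expanded reference."""
    hyp = hypothesis.split()
    return min(edit_distance(hyp, ref) for ref in expand(slots))


# Three slots with two alternatives each encode 2 * 2 * 2 = 8 references.
slots = [
    ["the cat", "a cat"],
    ["sat", "was sitting"],
    ["on the mat", "on the rug"],
]
references = list(expand(slots))
```

Exhaustive expansion is only workable for toy inputs like this one. With millions of encoded translations, the minimum distance would have to be computed on the compact structure itself or approximated, which this sketch does not attempt.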


Keywords: machine translation evaluation, reference translations





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ondřej Bojar (1)
  • Matouš Macháček (1)
  • Aleš Tamchyna (1)
  • Daniel Zeman (1)

  1. Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, Charles University in Prague, Czech Republic
