Advertisement

Minimum Bayes Risk Decoding with Enlarged Hypothesis Space in System Combination

  • Tsuyoshi Okita
  • Josef van Genabith
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7182)

Abstract

This paper describes a new system combination strategy in Statistical Machine Translation. Tromble et al. (2008) introduced the evidence space into Minimum Bayes Risk decoding in order to quantify the relative performance within lattice or n-best output with regard to the 1-best output. In contrast, our approach is to enlarge the hypothesis space in order to incorporate the combinatorial nature of MBR decoding. In this setting, we perform experiments on three language pairs ES-EN, FR-EN and JP-EN. For ES-EN JRC-Acquis our approach shows 0.50 BLEU points absolute and 1.9% relative improvement obver the standard confusion network-based system combination without hypothesis expansion, and 2.16 BLEU points absolute and 9.2% relative improvement compared to the single best system. For JP-EN NTCIR-8 the improvement is 0.94 points absolute and 3.4% relative, and for FR-EN WMT09 0.30 points absolute and 1.3% relative compared to the single best system, respectively.

Keywords

Machine Translation Output Sequence Statistical Machine Translation Hypothesis Space Word Alignment 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Arun, A., Haddow, B., Koehn, P.: A unified approach to minimum risk training and decoding. In: Proceedings of Fifth Workshop on Statistical Machine Translation and MetricsMATR, pp. 365–374 (2010)Google Scholar
  2. 2.
    Bangalore, S., Bordel, G., Riccardi, G.: Computing consensus translation from multiple machine translation systems. In: Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 350–354 (2001)Google Scholar
  3. 3.
    Callison-Burch, C., Koehn, P., Monz, C., Schroeder, J.: Findings of the 2009 workshop on statistical machine translation. In: Proceedings of EACL Workshop on Statistical Machine Translation 2009, pp. 1–28 (2009)Google Scholar
  4. 4.
    DeNero, J., Chiang, D., Knight, K.: Fast consensus decoding over translation forests. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pp. 567–575 (2009)Google Scholar
  5. 5.
    DeNero, J., Kumar, S., Chelba, C., Och, F.: Model combination for machine translation. In: Proceedings of NAACL, pp. 975–983 (2010)Google Scholar
  6. 6.
    Du, J., He, Y., Penkale, S., Way, A.: MaTrEx: the DCU MT System for WMT 2009. In: Proceedings of the Third EACL Workshop on Statistical Machine Translation, pp. 95–99 (2009)Google Scholar
  7. 7.
    Federmann, C.: Ml4hmt workshop challenge at mt summit xiii. In: Proceedings of ML4HMT Workshop, pp. 110–117 (2011)Google Scholar
  8. 8.
    Fujii, A., Utiyama, M., Yamamoto, M., Utsuro, T., Ehara, T., Echizen-ya, H., Shimohata, S.: Overview of the patent translation task at the NTCIR-8 workshop. In: Proceedings of the 8th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-lingual Information Access, pp. 293–302 (2010)Google Scholar
  9. 9.
    Goel, V., Byrne, W.: Task dependent loss functions in speech recognition: A-star search over recognition lattices. In: Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH), pp. 51–80 (1999)Google Scholar
  10. 10.
    Koehn, P.: Statistical machine translation. Cambridge University Press (2010)Google Scholar
  11. 11.
    Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Herbst, E.: Moses: Open source toolkit for Statistical Machine Translation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pp. 177–180 (2007)Google Scholar
  12. 12.
    Koehn, P., Och, F., Marcu, D.: Statistical phrase-based translation. In: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computationa Linguistics (HLT / NAACL 2003), pp. 115–124 (2003)Google Scholar
  13. 13.
    Koller, D., Friedman, N.: Probabilistic graphical models: Principles and techniques. MIT Press (2009)Google Scholar
  14. 14.
    Kumar, S., Byrne, W.: Minimum Bayes-Risk word alignment of bilingual texts. In: Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2002), pp. 140–147 (2002)Google Scholar
  15. 15.
    Leusch, G., Matusov, E., Ney, H.: The rwth system combination system for wmt 2009. In: Fourth EACL Workshop on Statistical Machine Translation (WMT 2009), pp. 56–60 (2009)Google Scholar
  16. 16.
    Matusov, E., Ueffing, N., Ney, H.: Computing consensus translation from multiple machine translation systems using enhanced hypotheses alignment. In: Proceedings of the 11st Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 33–40 (2006)Google Scholar
  17. 17.
    Och, F.: Minimum Error Rate Training in Statistical Machine Translation. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pp. 160–167 (2003)Google Scholar
  18. 18.
    Okita, T.: Word alignment and smoothing method in statistical machine translation: Noise, prior knowledge and overfitting. PhD thesis. Dublin City University, pp. 1–130 (2011)Google Scholar
  19. 19.
    Okita, T., Guerra, A.M., Graham, Y., Way, A.: Multi-Word Expression sensitive word alignment. In: Proceedings of the Fourth International Workshop On Cross Lingual Information Access (CLIA 2010, collocated with COLING 2010), Beijing, China, pp. 1–8 (2010)Google Scholar
  20. 20.
    Okita, T., Jiang, J., Haque, R., Al-Maghout, H., Du, J., Naskar, S.K., Way, A.: MaTrEx: the DCU MT System for NTCIR-8. In: Proceedings of the MII Test Collection for IR Systems-8 Meeting (NTCIR-8), Tokyo, pp. 377–383 (2010)Google Scholar
  21. 21.
    Okita, T., Way, A.: Given bilingual terminology in statistical machine translation: Mwe-sensitve word alignment and hierarchical pitman-yor process-based translation model smoothing. In: Proceedings of the 24th International Florida Artificial Intelligence Research Society Conference (FLAIRS-24), pp. 269–274 (2011)Google Scholar
  22. 22.
    Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: A Method For Automatic Evaluation of Machine Translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL 2002), pp. 311–318 (2002)Google Scholar
  23. 23.
    Pearl, J.: Reverend bayes on inference engines: A distributed hierarchical approach. In: Proceedings of the Second National Conference on Artificial Intelligence (AAAI 1982), pp. 133–136 (1983)Google Scholar
  24. 24.
    Shenoy, P., Shafer, G.: Axioms for probability and belief-function propagation. In: Proceedings of the 6th Conference of Uncertainty in Artificial Intelligence (UAI), pp. 169–198 (1990)Google Scholar
  25. 25.
    Snover, M., Dorr, B., Schwartz, R., Micciulla, L., Makhoul, J.: A study of translation edit rate with targeted human annotation. In: Proceedings of Association for Machine Translation in the Americas, pp. 223–231 (2006)Google Scholar
  26. 26.
    Steinberger, R., Pouliquen, B., Widiger, A., Ignat, C., Erjavec, T., Tufis, D., Varga, D.: The jrc-acquis: A multilingual aligned parallel corpus with 20+ languages. In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), pp. 2142–2147 (2006)Google Scholar
  27. 27.
    Stolcke, A.: SRILM – An extensible language modeling toolkit. In: Proceedings of the International Conference on Spoken Language Processing, pp. 901–904 (2002)Google Scholar
  28. 28.
    Tromble, R., Kumar, S., Och, F., Macherey, W.: Lattice minimum bayes-risk decoding for statistical machine translation. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 620–629 (2008)Google Scholar
  29. 29.
    Viterbi, A.J.: Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory 13, 260–269 (1967)zbMATHCrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tsuyoshi Okita
    • 1
  • Josef van Genabith
    • 1
  1. 1.School of ComputingDublin City UniversityDublin 9Ireland

Personalised recommendations