Machine Translation, Volume 26, Issue 4, pp 289–323

What types of word alignment improve statistical machine translation?

  • Patrik Lambert
  • Simon Petitrenaud
  • Yanjun Ma
  • Andy Way

Abstract

In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. However, there is a need for systematic study as to what alignment characteristics can benefit MT under specific experimental settings such as the type of MT system, the language pair or the type or size of the corpus. In this paper we perform, in each of these experimental settings, a statistical analysis of the data and study the sample correlation coefficients between a number of alignment or phrase table characteristics and variables such as the phrase table size, the number of untranslated words or the BLEU score. We report results for two different SMT systems (a phrase-based and an n-gram-based system) on Chinese-to-English FBIS and BTEC data, and Spanish-to-English European Parliament data. We find that the alignment characteristics which help in translation greatly depend on the MT system and on the corpus size. We give alignment hints to improve BLEU score, depending on the SMT system used and the type of corpus. For example, for phrase-based SMT, dense alignments are required with larger corpora, especially on the target side, while with smaller corpora, more precise, sparser alignments are better, especially on the source side. Avoiding some long-distance crossing links may also improve BLEU score with small corpora. We take these conclusions into account to modify two types of alignment systems, and get 1 to 1.6% relative improvements in BLEU score on two held-out corpora, although the improved system is different in each corpus.
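The sample correlation coefficient mentioned in the abstract can be illustrated with a minimal sketch. It assumes the Pearson form of the coefficient; the function name `sample_correlation`, the variable names, and the numbers are hypothetical illustrations, not the authors' code or results.

```python
import math


def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products and squares; the 1/(n-1) factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up toy values (NOT taken from the paper): one entry per alignment
# configuration, pairing an alignment characteristic with a BLEU score.
links_per_word = [0.8, 0.9, 1.0, 1.1, 1.2]
bleu_scores = [21.3, 21.9, 21.6, 22.4, 22.1]

r = sample_correlation(links_per_word, bleu_scores)
print(f"sample correlation: {r:.3f}")
```

A value of `r` near +1 or -1 would indicate a strong linear association between the characteristic and the score across configurations, while a value near 0 would indicate none; it says nothing about causation on its own.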

Keywords

  • Statistical machine translation
  • Word alignment
  • Phrase extraction
  • Discriminative training

Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Patrik Lambert (1)
  • Simon Petitrenaud (1)
  • Yanjun Ma (2)
  • Andy Way (3)
  1. LIUM, LUNAM Université, University of Le Mans, Le Mans Cedex 9, France
  2. Baidu Inc., Beijing, China
  3. Applied Language Solutions, Delph, UK
