Machine Translation, Volume 30, Issue 1–2, pp 1–18

Learning local word reorderings for hierarchical phrase-based statistical machine translation

Article

Abstract

Statistical models for reordering source words have been used to enhance hierarchical phrase-based statistical machine translation. Existing word-reordering models learn reorderings either for any two source words in a sentence or only for two contiguous words. This paper proposes a series of separate sub-models that learn reorderings for word pairs at different distances. Our experiments demonstrate that reordering sub-models for word pairs whose distance is below a specific threshold improve translation quality. Compared with previous work, our method exploits helpful word-reordering information more effectively and efficiently: it improves a basic hierarchical phrase-based system by 2.4–3.1 BLEU points while keeping the average time to translate one sentence under 10 s.
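
The idea of one sub-model per word-pair distance can be illustrated with a minimal sketch. It only shows how training pairs might be grouped by distance below a threshold; it is not the authors' implementation. The function name `pairs_by_distance`, the one-to-one `alignment` dictionary, and the monotone/swapped label are all assumptions made for illustration, and the features and classifiers used in the paper are not shown.

```python
from collections import defaultdict


def pairs_by_distance(source_words, alignment, max_distance):
    """Group source word pairs by their distance in the sentence.

    Hypothetical helper: `alignment` maps a source index to a single
    target index (a one-to-one simplification assumed for this sketch).
    Each pair is labelled True if its target order is monotone and
    False if the two words are swapped on the target side.
    """
    buckets = defaultdict(list)
    n = len(source_words)
    for i in range(n):
        # Only pairs with distance below the threshold are collected,
        # so each distance d < max_distance gets its own training set.
        for d in range(1, max_distance):
            j = i + d
            if j >= n:
                break
            if i not in alignment or j not in alignment:
                continue  # skip pairs with an unaligned word
            monotone = alignment[i] < alignment[j]
            buckets[d].append((source_words[i], source_words[j], monotone))
    return dict(buckets)


words = ["a", "b", "c", "d"]
align = {0: 0, 1: 2, 2: 1, 3: 3}  # toy alignment: "b" and "c" swap
buckets = pairs_by_distance(words, align, max_distance=3)
```

In this toy run, `buckets[1]` holds the contiguous pairs and `buckets[2]` the pairs two positions apart; a separate classifier could then be trained on each bucket.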

Keywords

Local word reorderings; Reorderings for hierarchical phrase-based SMT; Separate reordering sub-models

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  1. National Institute of Information and Communications Technology, Keihanna Science City, Japan
  2. Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Japan
  3. Department of Computer Science and Engineering, Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China
