Abstract
Statistical models for reordering source words have been used to enhance hierarchical phrase-based statistical machine translation. Existing word-reordering models learn reorderings either for any two source words in a sentence or only for two contiguous words. This paper proposes a series of separate sub-models that learn reorderings for word pairs at different distances. Our experiments demonstrate that reordering sub-models for word pairs with distances less than a specific threshold improve translation quality. Compared with previous work, our method exploits helpful word-reordering information more effectively and efficiently; it improves a basic hierarchical phrase-based system by 2.4–3.1 BLEU points and keeps the average time of translating one sentence under 10 s.
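To make the idea of distance-specific sub-models concrete, the following is a minimal sketch of how source word pairs might be grouped by distance so that each group could feed its own sub-model. The function name `pairs_by_distance`, the parameter `max_distance`, and the inclusive cutoff are illustrative assumptions; the abstract does not specify how pairs are collected or what features the sub-models use.

```python
from collections import defaultdict


def pairs_by_distance(words, max_distance):
    """Group source position pairs (i, j), i < j, by their distance j - i.

    Hypothetical helper: pairs farther apart than max_distance (inclusive
    cutoff, an assumption) are skipped, so only short-distance groups remain.
    """
    buckets = defaultdict(list)
    last = len(words) - 1
    for i in range(len(words)):
        # j runs over positions to the right of i, up to the distance cutoff.
        for j in range(i + 1, min(i + max_distance, last) + 1):
            buckets[j - i].append((i, j))
    return dict(buckets)


# Each key would correspond to one distance-specific sub-model.
groups = pairs_by_distance(["the", "light", "source", "4"], max_distance=2)
for distance, pairs in sorted(groups.items()):
    print(distance, pairs)
```

Keeping one bucket per distance means that dropping the sub-models beyond the threshold amounts to never generating their pairs, which is also where the time saving would come from in this sketch.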
Notes
In translation experiments, we also tried adding a new penalty feature (how many source words in the input sentence are unaligned) to penalize unaligned words. However, this feature did not influence translation performance significantly.
Note that these scores are calculated separately for each sub-model \(M_n\), and the sub-model weights are tuned separately.
In the original Hiero paper (Chiang 2005), at most two nonterminals are allowed. However, rules with more than two nonterminals are theoretically possible, hence our use of K here.
As we are using a cache, memory usage is a concern, but the size of the cache for each sentence is negligible compared to the size of the translation and language models, and thus the memory footprint is not increased significantly.
Note that “4” and “5” in the source and target sentences are original source and target words. This sentence pair is from a patent-translation corpus and there is a figure in the article, where the light source is labeled as 4 and the optical fiber is labeled as 5.
A cache was used in all experiments.
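The unaligned-word penalty described in the first note (the number of source words in the input sentence that are unaligned) can be sketched as below. This is only an illustration: the function name and the representation of the alignment as `(source_index, target_index)` pairs are assumptions, not details given in the notes.

```python
def count_unaligned_source_words(source_length, alignment):
    """Return how many source positions have no alignment link.

    alignment is assumed to be an iterable of (source_index, target_index)
    pairs with 0-based indices; this format is hypothetical.
    """
    aligned = {src for src, _ in alignment}
    return sum(1 for i in range(source_length) if i not in aligned)


# Four source words; positions 1 and 3 have no link, so the penalty is 2.
penalty = count_unaligned_source_words(4, [(0, 0), (2, 1)])
print(penalty)
```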
References
Bisazza A, Federico M (2013) Dynamically shaping the reordering search space of phrase-based statistical machine translation. Trans Assoc Comput Linguist 1:327–340
Cao H, Zhang D, Li M, Zhou M, Zhao T (2014) A lexicalized reordering model for hierarchical phrase-based translation. In: Coling 2014: proceedings of 25th international conference on computational linguistics. Dublin, pp 1144–1153
Chen S, Goodman J (1999) An empirical study of smoothing techniques for language modeling. Comput Speech Lang 13(4):359–393
Chiang D (2005) A hierarchical phrase-based model for statistical machine translation. In: ACL-05: 43rd annual meeting of the association for computational linguistics. Ann Arbor, Michigan, pp 263–270
Chiang D (2012) Hope and fear for discriminative training of statistical translation models. J Mach Learn Res 13(1):1159–1187
Cui L, Zhang D, Li M, Zhou M, Zhao T (2010) A joint rule selection model for hierarchical phrase-based translation. In: ACL 2010: the 48th annual meeting of the association for computational linguistics. Uppsala, pp 6–11
Feng M, Peter JT, Ney H (2013) Advancements in reordering models for statistical machine translation. In: Proceedings of the 51st annual meeting of the association for computational linguistics, vol 1, Long Papers, Sofia, pp 322–332
Gao Y, Koehn P, Birch A (2011) Soft dependency constraints for reordering in hierarchical phrase-based translation. In: Proceedings of EMNLP 2011, conference on empirical methods in natural language processing. Edinburgh, pp 857–868
Goto I, Lu B, Chow KP, Sumita E, Tsou BK (2011) Overview of the patent machine translation task at the NTCIR-9 workshop. In: Proceedings of the 9th NII test collection for IR systems workshop meeting. Tokyo, pp 559–578
Hayashi K, Tsukada H, Sudoh K, Duh K, Yamamoto S (2010) Hierarchical phrase-based machine translation with word-based reordering model. In: Coling 2010: proceedings of 23rd international conference on computational linguistics. Beijing, pp 439–446
He Z, Liu Q, Lin S (2008) Improving statistical machine translation using lexicalized rule selection. In: Coling 2008: proceedings of 22nd international conference on computational linguistics. Manchester, pp 321–328
Hopkins M, May J (2011) Tuning as ranking. In: Proceedings of EMNLP 2011, conference on empirical methods in natural language processing, Edinburgh, pp 1352–1362
Huck M, Wuebker J, Rietig F, Ney H (2013) A phrase orientation model for hierarchical machine translation. In: WMT 2013: Proceedings of 8th workshop on statistical machine translation. Sofia, pp 452–463
Kazemi A, Toral A, Way A, Monadjemi A, Nematbakhsh M (2015) Dependency-based reordering model for constituent pairs in hierarchical SMT. In: EAMT-2015: proceedings of the eighteenth annual conference of the European association for machine translation. Antalya, pp 43–50
Koehn P (2004) Statistical significance tests for machine translation evaluation. In: Proceedings of the 2004 conference on empirical methods in natural language processing. Barcelona, pp 388–395
Koehn P, Och FJ, Marcu D (2003) Statistical phrase-based translation. In: HLT-NAACL 2003: conference combining human language technology conference series and the North American chapter of the association for computational linguistics conference series. Edmonton, pp 48–54
Koehn P, Axelrod A, Birch A, Callison-Burch C, Osborne M, Talbot D, White M (2005) Edinburgh system description for the 2005 IWSLT speech translation evaluation. In: International workshop on spoken language translation: evaluation campaign on spoken language translation. Pittsburgh, pp 68–75
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: The 45th annual meeting of the association for computational linguistics: demo and poster sessions. Prague, pp 177–180
Li P, Liu Y, Sun M, Izuha T, Zhang D (2014) A neural reordering model for phrase-based translation. In: Coling 2014: proceedings of 25th international conference on computational linguistics. Dublin, pp 1897–1907
Liu Q, He Z, Liu Y, Lin S (2008) Maximum entropy based rule selection model for syntax-based statistical machine translation. In: EMNLP 2008: proceedings of 2008 conference on empirical methods in natural language processing, Honolulu, pp 89–97
Marton Y, Resnik P (2008) Soft syntactic constraints for hierarchical phrased-based translation. In: ACL-08: HLT, proceedings of 46th annual meeting of the association for computational linguistics: human language technologies. Columbus, pp 1003–1011
Nguyen T, Vogel S (2013) Integrating phrase-based reordering features into a chart-based decoder for machine translation. In: ACL 2013, Proceedings of 51st annual meeting of the association for computational linguistics. Sofia, pp 1587–1596
Ni Y, Saunders C, Szedmak S, Niranjan M (2009) Handling phrase reorderings for machine translation. In: Proceedings of ACL-IJCNLP 2009, joint conference of the 47th annual meeting of the association for computational linguistics and 4th international joint conference on natural language processing of the AFNLP. Suntec, pp 241–244
Och FJ (2003) Minimum error rate training in statistical machine translation. In: Proceedings of 41st Annual meeting of the association for computational linguistics. Sapporo, pp 160–167
Och FJ, Ney H (2002) Discriminative training and maximum entropy models for statistical machine translation. In: Proceedings of the conference on 40th annual meeting of the association for computational linguistics, Philadelphia, pp 295–302
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51
Rumelhart D, Hinton G, Williams R (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536
Tromble R, Eisner J (2009) Learning linear ordering problems for better translation. In: EMNLP 2009, proceedings of the 2009 conference on empirical methods in natural language processing, Singapore, pp 1007–1016
Vaswani A, Zhao Y, Fossum V, Chiang D (2013) Decoding with large-scale neural language models improves translation. In: Proceedings of the 2013 Conference on empirical methods in natural language processing. Seattle, pp 1387–1392
Wang X, Xiong D, Zhang M (2015) Learning semantic representations for nonterminals in hierarchical phrase-based translation. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Lisbon, pp 1391–1400
Zens R, Ney H (2006) Discriminative reordering models for statistical machine translation. In: Proceedings of the workshop HLT-NAACL 06, statistical machine translation. New York City, pp 55–63
Zhang J, Utiyama M, Sumita E, Zhao H (2015) Learning word reorderings for hierarchical phrase-based statistical machine translation. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, vol 2. Short Papers. Beijing, pp 542–548
Zhao H, Huang CN, Li M (2006) An improved Chinese word segmentation system with conditional random field. In: Proceedings of the fifth SIGHAN workshop on Chinese language processing. Sydney, pp 162–165
Acknowledgments
Hai Zhao was partially supported by the National Natural Science Foundation of China (Grant No. 61170114, and Grant No. 61272248), the National Basic Research Program of China (Grant No. 2013CB329401), the Science and Technology Commission of Shanghai Municipality (Grant No. 13511500200), the European Union Seventh Framework Program (Grant No. 247619), the Cai Yuanpei Program (CSC fund 201304490199, 201304490171), and the art and science interdisciplinary funds of Shanghai Jiao Tong University, No. 14X190040031, and the Key Project of National Society Science Foundation of China, No. 15-ZDA041.
Cite this article
Zhang, J., Utiyama, M., Sumita, E. et al. Learning local word reorderings for hierarchical phrase-based statistical machine translation. Machine Translation 30, 1–18 (2016). https://doi.org/10.1007/s10590-016-9178-7
DOI: https://doi.org/10.1007/s10590-016-9178-7