Learning local word reorderings for hierarchical phrase-based statistical machine translation

Zhang, Jingyi; Utiyama, Masao; Sumita, Eiichro; Zhao, Hai; Neubig, Graham; Nakamura, Satoshi

doi:10.1007/s10590-016-9178-7

Published: 12 March 2016

Volume 30, pages 1–18 (2016)

Machine Translation

Jingyi Zhang¹˒², Masao Utiyama¹, Eiichro Sumita¹, Hai Zhao³, Graham Neubig² & Satoshi Nakamura²

Abstract

Statistical models for reordering source words have been used to enhance hierarchical phrase-based statistical machine translation. There are existing word-reordering models that learn reorderings for any two source words in a sentence or only for two contiguous words. This paper proposes a series of separate sub-models to learn reorderings for word pairs with different distances. Our experiments demonstrate that reordering sub-models for word pairs with distances less than a specific threshold are useful to improve translation quality. Compared with previous work, our method more effectively and efficiently exploits helpful word-reordering information; it improves a basic hierarchical phrase-based system by 2.4–3.1 BLEU points and keeps the average time of translating one sentence under 10 s.
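
The idea of keeping one sub-model per word-pair distance can be illustrated with a minimal sketch. It only shows how source word pairs might be bucketed by distance below a threshold, so that each bucket could feed its own sub-model. The function name `group_pairs_by_distance`, the parameter `threshold`, and the definition of distance as the difference of word positions are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_pairs_by_distance(
    words: List[str], threshold: int
) -> Dict[int, List[Tuple[int, int]]]:
    """Bucket source word index pairs (i, j), i < j, by distance n = j - i.

    Only distances 1 <= n < threshold are kept (assumption: distance is the
    difference of positions, so n = 1 means two contiguous words).
    """
    groups: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for i in range(len(words)):
        for n in range(1, threshold):
            j = i + n
            if j < len(words):
                # Pairs at distance n would be handled by the n-th sub-model.
                groups[n].append((i, j))
    return dict(groups)


if __name__ == "__main__":
    sentence = ["a", "b", "c", "d"]
    for distance, pairs in sorted(group_pairs_by_distance(sentence, 3).items()):
        print(distance, pairs)
```

Under this reading, setting `threshold` to 2 keeps only contiguous word pairs, while a threshold larger than the sentence length covers every pair of source words; these correspond to the two kinds of existing models the abstract mentions.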

Notes

In translation experiments, we also tried adding a new penalty feature (how many source words in the input sentence are unaligned) to penalize unaligned words. However, this feature did not influence translation performance significantly.
Note that these scores are correspondingly calculated for different sub-models \(M_n\) and the sub-model weights are tuned separately.
In the original Hiero paper (Chiang 2005), only two nonterminals are allowed. However, it is not theoretically impossible to create rules with more than two nonterminals, hence our use of K here.
As we are using a cache, memory usage is a concern, but the size of the cache for each sentence is negligible compared to the size of the translation and language models, and thus the memory footprint is not increased significantly.
http://sourceforge.net/projects/mecab/files/.
http://hlt.fbk.eu/en/irstlm.
Note that “4” and “5” in the source and target sentences are original source and target words. This sentence pair is from a patent-translation corpus and there is a figure in the article, where the light source is labeled as 4 and the optical fiber is labeled as 5.
Cache was used in all experiments.

References

Bisazza A, Federico M (2013) Dynamically shaping the reordering search space of phrase-based statistical machine translation. Trans Assoc Comput Linguist 1:327–340
Cao H, Zhang D, Li M, Zhou M, Zhao T (2014) A lexicalized reordering model for hierarchical phrase-based translation. In: Coling 2014: proceedings of 25th international conference on computational linguistics. Dublin, pp 1144–1153
Chen S, Goodman J (1999) An empirical study of smoothing techniques for language modeling. Comput Speech Lang 13(4):359–393
Chiang D (2005) A hierarchical phrase-based model for statistical machine translation. ACL-05: 43rd annual meeting of the association for computational linguistics. Michigan, Ann Arbor, pp 263–270
Chiang D (2012) Hope and fear for discriminative training of statistical translation models. J Mach Learn Res 13(1):1159–1187
Cui L, Zhang D, Li M, Zhou M, Zhao T (2010) A joint rule selection model for hierarchical phrase-based translation. ACL 2010: the 48th annual meeting of the association for computational linguistics. Uppsala, pp 6–11
Feng M, Peter JT, Ney H (2013) Advancements in reordering models for statistical machine translation. Proceedings of the 51st annual meeting of the association for computational linguistics, vol 1, Long Papers, Sofia, pp 322–332
Gao Y, Koehn P, Birch A (2011) Soft dependency constraints for reordering in hierarchical phrase-based translation. In: Proceedings of EMNLP 2011, conference on empirical methods in natural language processing. Edinburgh, pp 857–868
Goto I, Lu B, Chow KP, Sumita E, Tsou BK (2011) Overview of the patent machine translation task at the NTCIR-9 workshop. In: Proceedings of the 9th NII test collection for IR systems workshop meeting. Tokyo, pp 559–578
Hayashi K, Tsukada H, Sudoh K, Duh K, Yamamoto S (2010) Hierarchical phrase-based machine translation with word-based reordering model. Coling 2010: proceedings of 23rd international conference on computational linguistics. Beijing, pp 439–446
He Z, Liu Q, Lin S (2008) Improving statistical machine translation using lexicalized rule selection. Coling 2008: proceedings of 22nd international conference on computational linguistics. Manchester, pp 321–328
Hopkins M, May J (2011) Tuning as ranking. In: Proceedings of EMNLP 2011, conference on empirical methods in natural language processing, Edinburgh, pp 1352–1362
Huck M, Wuebker J, Rietig F, Ney H (2013) A phrase orientation model for hierarchical machine translation. WMT 2013: Proceedings of 8th workshop on statistical machine translation. Sofia, pp 452–463
Kazemi A, Toral A, Way A, Monadjemi A, Nematbakhsh M (2015) Dependency-based reordering model for constituent pairs in hierarchical SMT. EAMT-2015: proceedings of the eighteenth annual conference of the european association for machine translation. Antalya, pp 43–50
Koehn P (2004) Statistical significance tests for machine translation evaluation. In: Proceedings of the 2004 conference on empirical methods in natural language processing. Barcelona, pp 388–395
Koehn P, Och FJ, Marcu D (2003) Statistical phrase-based translation. HLT-NAACL 2003: conference combining human language technology conference series and the North American chapter of the association for computational linguistics conference series. Edmonton, pp 48–54
Koehn P, Axelrod A, Birch A, Callison-Burch C, Osborne M, Talbot D, White M (2005) Edinburgh system description for the 2005 IWSLT speech translation evaluation. In: International workshop on spoken language translation: evaluation campaign on spoken language translation. Pittsburgh, pp 68–75
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. The 45th annual meeting of the association for computational linguistics: demo and poster sessions. Prague, pp 177–180
Li P, Liu Y, Sun M, Izuha T, Zhang D (2014) A neural reordering model for phrase-based translation. Coling 2014: proceedings of 25th international conference on computational linguistics. Dublin, pp 1897–1907
Liu Q, He Z, Liu Y, Lin S (2008) Maximum entropy based rule selection model for syntax-based statistical machine translation. In: EMNLP 2008: proceedings of 2008 conference on empirical methods in natural language processing, Honolulu, pp 89–97
Marton Y, Resnik P (2008) Soft syntactic constraints for hierarchical phrased-based translation. In: ACL-08: HLT, proceedings of 46th annual meeting of the association for computational linguistics: human language technologies. Columbus, pp 1003–1011
Nguyen T, Vogel S (2013) Integrating phrase-based reordering features into a chart-based decoder for machine translation. In: ACL 2013, Proceedings of 51st annual meeting of the association for computational linguistics. Sofia, pp 1587–1596
Ni Y, Saunders C, Szedmak S, Niranjan M (2009) Handling phrase reorderings for machine translation. In: Proceedings of ACL-IJCNLP 2009, joint conference of the 47th annual meeting of the association for computational linguistics and 4th international joint conference on natural language processing of the AFNLP. Suntec, pp 241–244
Och FJ (2003) Minimum error rate training in statistical machine translation. In: Proceedings of 41st Annual meeting of the association for computational linguistics. Sapporo, pp 160–167
Och FJ, Ney H (2002) Discriminative training and maximum entropy models for statistical machine translation. In: Proceedings of the conference on 40th annual meeting of the association for computational linguistics, Philadelphia, pp 295–302
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51
Rumelhart D, Hinton G, Williams R (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536
Tromble R, Eisner J (2009) Learning linear ordering problems for better translation. In: EMNLP 2009, proceedings of the 2009 conference on empirical methods in natural language processing, Singapore, pp 1007–1016
Vaswani A, Zhao Y, Fossum V, Chiang D (2013) Decoding with large-scale neural language models improves translation. In: Proceedings of the 2013 Conference on empirical methods in natural language processing. Seattle, pp 1387–1392
Wang X, Xiong D, Zhang M (2015) Learning semantic representations for nonterminals in hierarchical phrase-based translation. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Lisbon, pp 1391–1400
Zens R, Ney H (2006) Discriminative reordering models for statistical machine translation. In: Proceedings of the workshop HLT-NAACL 06, statistical machine translation. New York City, pp 55–63
Zhang J, Utiyama M, Sumita E, Zhao H (2015) Learning word reorderings for hierarchical phrase-based statistical machine translation. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing, vol 2. Short Papers. Beijing, pp 542–548
Zhao H, Huang CN, Li M (2006) An improved Chinese word segmentation system with conditional random field. In: Proceedings of the fifth SIGHAN workshop on chinese language processing. Sydney, pp 162–165

Acknowledgments

Hai Zhao was partially supported by the National Natural Science Foundation of China (Grant No. 61170114, and Grant No. 61272248), the National Basic Research Program of China (Grant No. 2013CB329401), the Science and Technology Commission of Shanghai Municipality (Grant No. 13511500200), the European Union Seventh Framework Program (Grant No. 247619), the Cai Yuanpei Program (CSC fund 201304490199, 201304490171), and the art and science interdisciplinary funds of Shanghai Jiao Tong University, No. 14X190040031, and the Key Project of National Society Science Foundation of China, No. 15-ZDA041.

Author information

Authors and Affiliations

National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto, 619-0289, Japan
Jingyi Zhang, Masao Utiyama & Eiichro Sumita
Graduate School of Information Science, Nara Institute of Science and Technology, Takayama, Ikoma, Nara, 630-0192, Japan
Jingyi Zhang, Graham Neubig & Satoshi Nakamura
Department of Computer Science and Engineering, Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Hai Zhao

Corresponding authors

Correspondence to Jingyi Zhang, Masao Utiyama or Hai Zhao.

About this article

Cite this article

Zhang, J., Utiyama, M., Sumita, E. et al. Learning local word reorderings for hierarchical phrase-based statistical machine translation. Machine Translation 30, 1–18 (2016). https://doi.org/10.1007/s10590-016-9178-7

Received: 18 September 2015
Accepted: 24 February 2016
Published: 12 March 2016
Issue Date: June 2016
DOI: https://doi.org/10.1007/s10590-016-9178-7
