
Improving Low-Resource Neural Machine Translation with Weight Sharing

  • Conference paper
  • In: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018, NLP-NABD 2018)

Abstract

Neural machine translation (NMT) has achieved great success with large-scale bilingual corpora in the past few years. However, it is much less effective for low-resource languages. To alleviate this problem, we present two approaches that improve the performance of low-resource NMT systems. The first approach shares decoder weights to strengthen the target language model of the low-resource NMT system. The second approach applies cross-lingual embeddings and shares the source sentence representation space to strengthen the encoder of the low-resource NMT system. Our experiments demonstrate that the proposed method obtains significant improvements over the baseline system on low-resource neural machine translation. On the IWSLT2015 Vietnamese-English translation task, our model improves translation quality by an average of 1.43 BLEU points, and we also obtain a gain of 0.96 BLEU points when translating from Mongolian to Chinese.
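To make the two approaches concrete, below is a minimal PyTorch sketch of the architecture the abstract describes, not the authors' actual implementation: all module names, the plain LSTM sequence-to-sequence setup (attention omitted for brevity), and the dimensions are illustrative assumptions. A high-resource pair and a low-resource pair that translate into the same target language keep separate source embeddings and encoders but share one decoder and output projection; the pair-specific source embeddings are where pre-trained cross-lingual embeddings would be plugged in to give both encoders a shared source representation space.

```python
import torch
import torch.nn as nn

class SharedDecoderNMT(nn.Module):
    """Two translation pairs into the same target language: each source
    language gets its own embedding and encoder, while the decoder and
    output projection are shared, so the low-resource pair reuses the
    target-side language model learned from the high-resource pair."""

    def __init__(self, src_vocab_hi, src_vocab_lo, tgt_vocab, d_model=512):
        super().__init__()
        # Pair-specific source embeddings. Initializing these from
        # pre-trained cross-lingual word embeddings is where the second
        # idea (a shared source representation space) would enter.
        self.emb_hi = nn.Embedding(src_vocab_hi, d_model)
        self.emb_lo = nn.Embedding(src_vocab_lo, d_model)
        # Pair-specific encoders.
        self.enc_hi = nn.LSTM(d_model, d_model, batch_first=True)
        self.enc_lo = nn.LSTM(d_model, d_model, batch_first=True)
        # Shared target embedding, decoder, and output projection.
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.dec = nn.LSTM(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src, tgt_in, pair):
        emb, enc = ((self.emb_hi, self.enc_hi) if pair == "hi"
                    else (self.emb_lo, self.enc_lo))
        _, state = enc(emb(src))                            # pair-specific encoding
        dec_out, _ = self.dec(self.tgt_emb(tgt_in), state)  # shared decoder
        return self.out(dec_out)                            # logits over target vocab

# Usage sketch: batches from both pairs update the same decoder weights,
# so target-side knowledge transfers to the low-resource direction.
model = SharedDecoderNMT(src_vocab_hi=32000, src_vocab_lo=8000, tgt_vocab=32000)
src = torch.randint(0, 8000, (4, 20))    # a low-resource source batch
tgt = torch.randint(0, 32000, (4, 22))   # teacher-forced target inputs
logits = model(src, tgt, pair="lo")      # shape: (4, 22, 32000)
```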



Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants No. 61572462 and 61502445.

Author information


Corresponding author

Correspondence to Miao Li.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Feng, T., Li, M., Liu, X., Cao, Y. (2018). Improving Low-Resource Neural Machine Translation with Weight Sharing. In: Sun, M., Liu, T., Wang, X., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL/NLP-NABD 2018. Lecture Notes in Computer Science, vol. 11221. Springer, Cham. https://doi.org/10.1007/978-3-030-01716-3_6


  • DOI: https://doi.org/10.1007/978-3-030-01716-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01715-6

  • Online ISBN: 978-3-030-01716-3

  • eBook Packages: Computer Science, Computer Science (R0)
