
Improving Low-Resource Neural Machine Translation with Weight Sharing

  • Conference paper
  • In: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018, NLP-NABD 2018)

Abstract

Neural machine translation (NMT) has achieved great success with large-scale bilingual corpora in the past few years. However, it is much less effective for low-resource languages. To alleviate this problem, we present two approaches that improve the performance of low-resource NMT systems. The first approach shares decoder weights to strengthen the target language model of the low-resource NMT system. The second approach applies cross-lingual embeddings and shares the source sentence representation space to strengthen the encoder of the low-resource NMT system. Our experiments demonstrate that the proposed method obtains significant improvements over the baseline system on low-resource neural machine translation. On the IWSLT2015 Vietnamese-English translation task, our model improves translation quality by an average of 1.43 BLEU points, and we also obtain a gain of 0.96 BLEU points when translating from Mongolian to Chinese.
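To make the two approaches concrete, below is a minimal PyTorch sketch of the architecture the abstract describes, not the authors' actual implementation: all module names, the plain LSTM sequence-to-sequence setup (attention omitted for brevity), and the dimensions are illustrative assumptions. A high-resource pair and a low-resource pair that translate into the same target language keep separate source embeddings and encoders but share one decoder and output projection; the pair-specific source embeddings are where pre-trained cross-lingual embeddings would be plugged in to give both encoders a shared source representation space.

```python
import torch
import torch.nn as nn

class SharedDecoderNMT(nn.Module):
    """Two translation pairs into the same target language: each source
    language gets its own embedding and encoder, while the decoder and
    output projection are shared, so the low-resource pair reuses the
    target-side language model learned from the high-resource pair."""

    def __init__(self, src_vocab_hi, src_vocab_lo, tgt_vocab, d_model=512):
        super().__init__()
        # Pair-specific source embeddings. Initializing these from
        # pre-trained cross-lingual word embeddings is where the second
        # idea (a shared source representation space) would enter.
        self.emb_hi = nn.Embedding(src_vocab_hi, d_model)
        self.emb_lo = nn.Embedding(src_vocab_lo, d_model)
        # Pair-specific encoders.
        self.enc_hi = nn.LSTM(d_model, d_model, batch_first=True)
        self.enc_lo = nn.LSTM(d_model, d_model, batch_first=True)
        # Shared target embedding, decoder, and output projection.
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.dec = nn.LSTM(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src, tgt_in, pair):
        emb, enc = ((self.emb_hi, self.enc_hi) if pair == "hi"
                    else (self.emb_lo, self.enc_lo))
        _, state = enc(emb(src))                            # pair-specific encoding
        dec_out, _ = self.dec(self.tgt_emb(tgt_in), state)  # shared decoder
        return self.out(dec_out)                            # logits over target vocab

# Usage sketch: batches from both pairs update the same decoder weights,
# so target-side knowledge transfers to the low-resource direction.
model = SharedDecoderNMT(src_vocab_hi=32000, src_vocab_lo=8000, tgt_vocab=32000)
src = torch.randint(0, 8000, (4, 20))    # a low-resource source batch
tgt = torch.randint(0, 32000, (4, 22))   # teacher-forced target inputs
logits = model(src, tgt, pair="lo")      # shape: (4, 22, 32000)
```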



Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grants No. 61572462 and 61502445.

Author information


Corresponding author

Correspondence to Miao Li.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Feng, T., Li, M., Liu, X., Cao, Y. (2018). Improving Low-Resource Neural Machine Translation with Weight Sharing. In: Sun, M., Liu, T., Wang, X., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL/NLP-NABD 2018. Lecture Notes in Computer Science, vol. 11221. Springer, Cham. https://doi.org/10.1007/978-3-030-01716-3_6


  • DOI: https://doi.org/10.1007/978-3-030-01716-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01715-6

  • Online ISBN: 978-3-030-01716-3

  • eBook Packages: Computer Science, Computer Science (R0)
