CWMT 2017: Machine Translation, pp. 30–42

Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT

  • Yining Wang
  • Long Zhou
  • Jiajun Zhang
  • Chengqing Zong
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 787)

Abstract

Neural machine translation (NMT) has emerged as a new approach to machine translation and has been shown to outperform conventional statistical machine translation (SMT) across a variety of language pairs. Most existing NMT systems operate with a fixed vocabulary, but translation is an open-vocabulary problem. Hence, previous work has mainly handled rare and unknown words by using different translation granularities, such as character, subword, and hybrid word-character. Although translation involving Chinese has proved to be one of the most difficult tasks, no study has yet demonstrated which translation granularity is most suitable for Chinese in NMT. In this paper, we conduct an extensive comparison using Chinese-English NMT as a case study. Furthermore, we discuss the advantages and disadvantages of the various translation granularities in detail. Our experiments show that the subword model performs best for Chinese-to-English translation, while the hybrid word-character model is most suitable for English-to-Chinese translation.
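As background for the subword granularity compared in the abstract, the following is a minimal sketch of byte-pair-encoding (BPE) merge learning in the style of Sennrich et al.: starting from characters, the most frequent adjacent symbol pair is repeatedly fused into a new subword unit. The function name and the toy English corpus below are illustrative only and do not come from the paper.

```python
from collections import Counter

def bpe_merges(corpus, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent
    adjacent symbol pair into a single subword unit."""
    # Represent each word as a tuple of symbols, starting from characters.
    vocab = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

corpus = ["low", "low", "lower", "lowest", "newer", "newer"]
merges, vocab = bpe_merges(corpus, 4)
```

On this toy corpus, the first merges fuse the frequent pairs ('l','o') and ('lo','w'), so "low" becomes a single subword unit while rarer suffixes such as "est" remain split into smaller pieces.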

Keywords

Neural machine translation · Translation granularity · Subword model · Character model

Notes

Acknowledgments

This research has been funded by the Natural Science Foundation of China under Grant Nos. 61333018 and 61402478, and is also supported by the Strategic Priority Research Program of the CAS under Grant No. XDB02070007.


Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  • Yining Wang (1)
  • Long Zhou (1)
  • Jiajun Zhang (1)
  • Chengqing Zong (1, 2)
  1. National Laboratory of Pattern Recognition, CASIA, University of Chinese Academy of Sciences, Beijing, China
  2. CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China