CWMT 2017: Machine Translation, pp. 30-42
Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT
Abstract
Neural machine translation (NMT) has become a new approach to machine translation and has been shown to outperform conventional statistical machine translation (SMT) across a variety of language pairs. Most existing NMT systems operate with a fixed vocabulary, but translation is an open-vocabulary problem. Hence, previous work mainly handles rare and unknown words by using different translation granularities, such as character, subword, and hybrid word-character. Although translation involving Chinese has proved to be one of the most difficult tasks, no study has demonstrated which translation granularity is most suitable for Chinese in NMT. In this paper, we conduct an extensive comparison using Chinese-English NMT as a case study. Furthermore, we discuss the advantages and disadvantages of the various translation granularities in detail. Our experiments show that the subword model performs best for Chinese-to-English translation, while the hybrid word-character model is most suitable for English-to-Chinese translation.
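To make the notion of translation granularity concrete, the following is a minimal sketch of how one sentence could be segmented at the word, character, and hybrid word-character level under a fixed vocabulary. It is an illustration only and not the implementation used in the paper: the function names, the `<unk>` and `<space>` symbols, and the `<c>` marker for characters of out-of-vocabulary words are all assumptions chosen for this example. Subword segmentation is left out because it requires learning merge operations from a training corpus first.

```python
UNK = "<unk>"        # assumed symbol for out-of-vocabulary words
SPACE = "<space>"    # assumed symbol for word boundaries at character level
CHAR_MARK = "<c>"    # hypothetical marker for characters of an OOV word


def word_level(sentence, vocab):
    """Word granularity: every word outside the fixed vocabulary becomes UNK."""
    return [w if w in vocab else UNK for w in sentence.split()]


def char_level(sentence):
    """Character granularity: each character is a token, so nothing is unknown,
    but sequences become much longer."""
    return [SPACE if c == " " else c for c in sentence]


def hybrid_level(sentence, vocab):
    """Hybrid word-character granularity: keep in-vocabulary words whole and
    spell out only the out-of-vocabulary words as marked characters."""
    tokens = []
    for w in sentence.split():
        if w in vocab:
            tokens.append(w)
        else:
            tokens.extend(CHAR_MARK + c for c in w)
    return tokens


if __name__ == "__main__":
    vocab = {"the", "cat", "sat"}          # toy fixed vocabulary
    sentence = "the cat sat quietly"
    print(word_level(sentence, vocab))     # rare word is lost as <unk>
    print(char_level(sentence))            # fully open vocabulary, long sequence
    print(hybrid_level(sentence, vocab))   # only the rare word is spelled out
```

In this toy setting the word-level output loses "quietly" entirely, the character-level output keeps everything at the cost of a 19-token sequence, and the hybrid output keeps the three frequent words intact while spelling out only the rare one.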
Keywords
Neural machine translation · Translation granularity · Subword model · Character model
Notes
Acknowledgments
This research was funded by the Natural Science Foundation of China under Grant Nos. 61333018 and 61402478, and was also supported by the Strategic Priority Research Program of the CAS under Grant No. XDB02070007.