Abstract
Current research on Tibetan machine translation focuses mainly on the Tibetan-Chinese direction, while the Chinese-Tibetan direction remains almost unexplored. In this paper, we apply a neural machine translation model to the Chinese-Tibetan translation task for the first time, and we also introduce syntax trees into Chinese-Tibetan neural machine translation for the first time, achieving good translation quality. For preprocessing, we use syllable segmentation on the Tibetan corpus and character segmentation on the Chinese corpus, which performs better than word segmentation on both corpora. The experimental results show that the neural translation model based entirely on the self-attention mechanism performs best on the Chinese-Tibetan machine translation task, with a BLEU score increase of one percentage point.
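The two preprocessing granularities named in the abstract can be illustrated with a minimal sketch. It assumes that Tibetan syllables are delimited by the tsheg mark (U+0F0B) and that the shad (U+0F0D) is kept as a separate punctuation token; the function names are hypothetical, and this is not the authors' actual pipeline, which the abstract does not describe in detail.

```python
import re
from typing import List

TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (assumed syllable delimiter)
SHAD = "\u0f0d"   # Tibetan clause/sentence-final mark (kept as its own token)


def segment_tibetan_syllables(text: str) -> List[str]:
    """Split Tibetan text into syllables at each tsheg (hypothetical helper)."""
    tokens: List[str] = []
    for chunk in text.split():
        # Separate shad marks from the syllables they follow.
        for part in re.split(f"({SHAD})", chunk):
            if part == SHAD:
                tokens.append(part)
            else:
                # Drop empty strings produced by trailing or doubled tsheg.
                tokens.extend(s for s in part.split(TSHEG) if s)
    return tokens


def segment_chinese_characters(text: str) -> List[str]:
    """Treat every non-whitespace character as one token (hypothetical helper)."""
    return [ch for ch in text if not ch.isspace()]


if __name__ == "__main__":
    tibetan = "\u0f56\u0f7c\u0f51\u0f0b\u0f66\u0f90\u0f51\u0f0d"
    print(" ".join(segment_tibetan_syllables(tibetan)))
    print(" ".join(segment_chinese_characters("机器翻译")))
```

Both helpers avoid any dictionary or trained segmenter, so unlike word segmentation they cannot introduce segmentation errors into the training data, at the cost of longer token sequences.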
Acknowledgement
This work is supported by the National Science Foundation of China (61331013).
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Lai, W., Zhao, X., Li, X. (2018). Research on Chinese-Tibetan Neural Machine Translation. In: Sun, M., Liu, T., Wang, X., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL 2018, NLP-NABD 2018. Lecture Notes in Computer Science, vol 11221. Springer, Cham. https://doi.org/10.1007/978-3-030-01716-3_9
DOI: https://doi.org/10.1007/978-3-030-01716-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01715-6
Online ISBN: 978-3-030-01716-3
eBook Packages: Computer Science, Computer Science (R0)