Research on Chinese-Tibetan Neural Machine Translation

  • Conference paper
Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018, NLP-NABD 2018)

Abstract

Research on Tibetan machine translation currently focuses on the Tibetan-Chinese direction, while the Chinese-Tibetan direction remains almost unexplored. In this paper, we apply neural machine translation models to the Chinese-Tibetan task for the first time and, also for the first time, introduce syntax trees into a Chinese-Tibetan neural machine translation model, achieving good translation quality. For preprocessing, we use syllable segmentation on the Tibetan corpus and character segmentation on the Chinese corpus, which performs better than word segmentation on both corpora. The experimental results show that the translation model based entirely on the self-attention mechanism performs best on the Chinese-Tibetan task, increasing the BLEU score by one percentage point.
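The two preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that Tibetan syllables are delimited by the tsheg mark (U+0F0B), that the shad mark (U+0F0D) is kept as a separate token, and that every non-whitespace character of the Chinese text becomes one token (so Latin words and digits would also be split, a simplification). The function names `segment_tibetan_syllables` and `segment_chinese_characters` are hypothetical.

```python
import re
from typing import List

TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (tsheg); assumed syllable delimiter
SHAD = "\u0f0d"   # Tibetan clause-final mark (shad); kept as its own token here


def segment_tibetan_syllables(text: str) -> List[str]:
    """Split Tibetan text into syllables at each tsheg."""
    tokens: List[str] = []
    for chunk in text.split():
        # The capturing group makes re.split return the shad marks too.
        for piece in re.split(f"({SHAD})", chunk):
            if piece == SHAD:
                tokens.append(piece)
            else:
                # Drop the empty strings left by trailing or doubled tsheg.
                tokens.extend(s for s in piece.split(TSHEG) if s)
    return tokens


def segment_chinese_characters(text: str) -> List[str]:
    """Treat every non-whitespace character as one token."""
    return [ch for ch in text if not ch.isspace()]


if __name__ == "__main__":
    # "bod skad" followed by a shad, written with escapes for portability.
    tibetan = "\u0f56\u0f7c\u0f51\u0f0b\u0f66\u0f90\u0f51\u0f0d"
    print(" ".join(segment_tibetan_syllables(tibetan)))
    print(" ".join(segment_chinese_characters("我爱 北京")))
```

Joining the tokens with spaces, as in the usage lines at the bottom, yields the whitespace-delimited input that most neural machine translation toolkits expect.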

Acknowledgement

This work is supported by the National Science Foundation of China (61331013).

Author information

Corresponding author

Correspondence to Xiaobing Zhao.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Lai, W., Zhao, X., Li, X. (2018). Research on Chinese-Tibetan Neural Machine Translation. In: Sun, M., Liu, T., Wang, X., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL 2018, NLP-NABD 2018. Lecture Notes in Computer Science, vol 11221. Springer, Cham. https://doi.org/10.1007/978-3-030-01716-3_9

  • DOI: https://doi.org/10.1007/978-3-030-01716-3_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01715-6

  • Online ISBN: 978-3-030-01716-3

  • eBook Packages: Computer Science, Computer Science (R0)
