
Neural Computing and Applications

Volume 32, Issue 1, pp 41–49

Research on the LSTM Mongolian and Chinese machine translation based on morpheme encoding

  • Ren Qing-dao-er-ji
  • Yi La Su
  • Wan Wan Liu
S.I.: Brain-Inspired Computing and Machine Learning for Brain Health

Abstract

The neural machine translation model based on long short-term memory (LSTM), with its encoder–decoder structure and capacity for semantic mining, has become the mainstream approach in machine translation. However, there are few studies of Mongolian–Chinese neural machine translation built on LSTM. This paper studies the preprocessing of a Mongolian–Chinese bilingual corpus and the construction of an LSTM translation model based on Mongolian morpheme encoding. In the corpus preprocessing stage, a hybrid algorithm is presented for constructing the word segmentation module: unannotated sequences are semantically processed and labeled by a combination of a gated recurrent unit (GRU) and a conditional random field (CRF). In the model construction stage, to learn more grammatical and semantic knowledge from the Mongolian corpus, the encoder is built as an LSTM neural network over morpheme codes, and an LSTM decoder predicts the Chinese output sequence. Experimental comparisons on sentences of different lengths show that the constructed model improves translation performance on long-term dependency problems.
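The abstract describes two components: a GRU-CRF sequence labeler used in corpus preprocessing, and an LSTM encoder–decoder operating on morpheme codes. As a rough illustration of the first, the sketch below (not the authors' implementation; it assumes PyTorch, the third-party pytorch-crf package, and illustrative vocabulary and layer sizes) shows how a bidirectional GRU can emit per-position tag scores that a CRF layer then scores as whole label sequences:

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party package: pip install pytorch-crf

class GRUCRFTagger(nn.Module):
    """Illustrative GRU-CRF sequence labeler for segmentation tags."""
    def __init__(self, vocab_size, num_tags, emb_dim=128, hid_dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU; each direction contributes hid_dim // 2 units.
        self.gru = nn.GRU(emb_dim, hid_dim // 2, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(hid_dim, num_tags)    # per-position emission scores
        self.crf = CRF(num_tags, batch_first=True)  # learned tag-transition scores

    def loss(self, ids, tags):
        feats, _ = self.gru(self.emb(ids))
        return -self.crf(self.proj(feats), tags)    # negative log-likelihood

    def decode(self, ids):
        feats, _ = self.gru(self.emb(ids))
        return self.crf.decode(self.proj(feats))    # Viterbi-best tag sequences
```

For the translation model itself, a minimal encoder–decoder along the lines the abstract describes might look as follows; again the vocabulary sizes, dimensions, and teacher-forced training interface are assumptions for illustration, and the paper's morpheme-coding specifics are not reproduced here:

```python
import torch
import torch.nn as nn

class MorphemeEncoderDecoder(nn.Module):
    """Illustrative LSTM encoder over morpheme IDs with an LSTM decoder."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)  # Mongolian morphemes
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)  # Chinese tokens
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)         # logits over Chinese vocab

    def forward(self, src_ids, tgt_ids):
        # Encode the morpheme sequence into a final (h, c) state ...
        _, state = self.encoder(self.src_emb(src_ids))
        # ... and decode conditioned on it (teacher forcing during training).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)

model = MorphemeEncoderDecoder(src_vocab=20000, tgt_vocab=30000)
src = torch.randint(0, 20000, (8, 25))  # batch of morpheme-segmented sources
tgt = torch.randint(0, 30000, (8, 20))  # shifted Chinese target inputs
logits = model(src, tgt)                # shape (8, 20, 30000)
```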

Keywords

Mongolian and Chinese machine translation · GRU-CRF algorithm · LSTM neural network · Neural machine translation

Notes

Acknowledgements

This work was financially supported by the Natural Science Foundation of Inner Mongolia (2018MS06021, 2016MS0605), the Foundation of the Autonomous Regional Civil Committee of Inner Mongolia (MW-2017-MGYWXXH-03), the Inner Mongolia Scientific and Technological Innovation Guide Reward Funds Project (Facilities Agricultural IoT key equipment and system development and industrialization demonstration), and the Inner Mongolia Science and Technology Plan Project (201502015).

Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  1. School of Information Engineering, Inner Mongolia University of Technology, Hohhot, China
