A Statistical Method for Translating Chinese into Under-resourced Minority Languages

  • Lei Chen
  • Miao Li
  • Jian Zhang
  • Zede Zhu
  • Zhenxin Yang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 493)


To improve statistical machine translation between Chinese and minority languages, most of which are under-resourced, differ from Chinese in word order, and have rich morphology, this paper proposes a method that combines source-side syntactic information with target-side morphological information to reduce word-order and morphological differences simultaneously. First, reordering rules are extracted automatically from the word alignments and the phrase structure trees of the source language, and are used to adjust the source-side word order. Then a morphological segmentation method based on a Hidden Markov Model is applied to obtain morphological information for the target language. The experiments take Chinese-Mongolian translation as an example. A morpheme-level statistical machine translation system, built on the reordered source side and the segmented target side, achieves a gain of 2.1 BLEU points over the standard phrase-based system.
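The reordering step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only states that rules come from word alignments and source phrase structure trees, so the rule format (a sequence of child labels mapped to a permutation), the mean-target-position heuristic, and the function names `child_order`, `extract_rules`, and `apply_rule` are all assumptions made for illustration. The morphological segmentation step is not covered here.

```python
from collections import Counter

def child_order(child_spans, alignment):
    """Order a node's children by the mean target position they align to.

    child_spans: (start, end) source-token spans, end exclusive, one per child.
    alignment: set of (source_index, target_index) links.
    Returns a permutation of child indices, or None if a child is unaligned.
    """
    means = []
    for start, end in child_spans:
        targets = [t for s, t in alignment if start <= s < end]
        if not targets:
            return None
        means.append(sum(targets) / len(targets))
    # sorted() is stable, so tied children keep their source order.
    return tuple(sorted(range(len(child_spans)), key=lambda i: means[i]))

def extract_rules(examples):
    """Keep, per child-label sequence, the most frequent non-identity permutation.

    examples: iterable of (labels, child_spans, alignment) triples, one per tree node.
    """
    counts = {}
    for labels, spans, alignment in examples:
        order = child_order(spans, alignment)
        if order is not None:
            counts.setdefault(tuple(labels), Counter())[order] += 1
    rules = {}
    for labels, counter in counts.items():
        best, _ = counter.most_common(1)[0]
        if best != tuple(range(len(labels))):
            rules[labels] = best
    return rules

def apply_rule(rules, labels, children):
    """Permute children if a rule matches their labels; otherwise keep the order."""
    order = rules.get(tuple(labels))
    return list(children) if order is None else [children[i] for i in order]

# Hypothetical toy node: a verb phrase "VV NP" whose verb aligns after its object,
# as would be expected when the target language is verb-final.
examples = [(["VV", "NP"], [(1, 2), (2, 3)], {(0, 0), (1, 2), (2, 1)})]
rules = extract_rules(examples)
reordered = apply_rule(rules, ["VV", "NP"], ["eat", "apple"])
```

In this toy case the single observed verb phrase yields a rule that swaps the verb and its object, so the source side is moved closer to the target order before training. A real system would need to handle unaligned words, rule conflicts, and sparse counts, none of which the sketch addresses.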


Under-resourced languages, Mongolian, Reordering, Morphological segmentation, Machine translation





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Lei Chen (1)
  • Miao Li (1)
  • Jian Zhang (1)
  • Zede Zhu (1)
  • Zhenxin Yang (1)

  1. Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, China
