Abstract
One of the difficulties in Chinese-English machine translation is that grammatical meaning expressed through morphology or syntax in the target translation is usually determined by function words or word order in the Chinese source. To address this issue, we develop classifiers that automatically detect the usages of common Chinese function words based on the Chinese Function Word Usage Knowledge Base (CFKB) and, as an initial step, propose a function word usage embedding model that incorporates the detection results into neural machine translation (NMT). Experiments on the NIST Chinese-English translation task demonstrate that the proposed method yields significant improvements in both translation quality and word alignment quality over the NMT baseline.
This work is partially supported by the National Basic Research Program of China (2014CB340504), the National Natural Science Foundation of China (No. 61402419, No. 60970083), the National Social Science Foundation (No. 14BYY096), the Basic Research Project of the Science and Technology Department of Henan Province (No. 142300410231, No. 142300410308), and the Science and Technology Project of the Science and Technology Department of Henan Province (No. 172102210478).
Notes
1. Extracted from LDC2002E18, LDC2003E07, LDC2003E14, the Hansards portion of LDC2004T07, LDC2004T08 and LDC2005T06.
2. We observe that the “concat” and “usage” methods perform better than the “part” method in terms of both translation and alignment quality in the experiments on “DE”. Therefore, we do not run the “part” method on the other function words.
3. Each Chinese sentence is followed by its reference translation. “BASE” denotes the baseline model, “FNMT” denotes the model with function word usages, and the text after “/” is the usage of the Chinese function word.
Acknowledgement
We thank the anonymous reviewers for their insightful comments and suggestions.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Zhang, K., Xu, H., Xiong, D., Liu, Q., Zan, H. (2018). Improving Chinese-English Neural Machine Translation with Detected Usages of Function Words. In: Huang, X., Jiang, J., Zhao, D., Feng, Y., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2017. Lecture Notes in Computer Science, vol 10619. Springer, Cham. https://doi.org/10.1007/978-3-319-73618-1_64
DOI: https://doi.org/10.1007/978-3-319-73618-1_64
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73617-4
Online ISBN: 978-3-319-73618-1
eBook Packages: Computer Science (R0)