Exploiting Knowledge Graph in Neural Machine Translation

  • Yu Lu
  • Jiajun Zhang
  • Chengqing Zong
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 954)


Neural machine translation (NMT) achieves promising translation quality on resource-rich languages thanks to end-to-end learning. However, widely used NMT systems model only the internal mapping from source to target and do not draw on external knowledge. In this paper, we take English-Chinese translation as a case study and explore the use of knowledge graphs (KGs) in NMT. The main idea is to use the entity relations in a knowledge graph as constraints that strengthen the connections between source words and their translations. Specifically, we design two kinds of constraints. The first is a monolingual constraint, which uses the entity relations in the KG to augment the semantic representation of the source words. The second is a bilingual constraint, which enforces that the entity relations between source words are shared by their translations. In this way, external knowledge participates in the translation process and helps to model semantic relationships between source and target words. Experimental results demonstrate that our method outperforms the state-of-the-art system.
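The bilingual constraint can be pictured with a small sketch. The abstract does not give its exact formulation, so everything here is an assumption made for illustration: a relation is modeled as the difference between tail and head embeddings, the penalty is a squared Euclidean distance, and the function names and toy vectors are hypothetical. The sketch only shows the general shape of "a relation between source words should also hold between their translations".

```python
# Illustrative only: the relation model, the penalty, and all names below
# are assumptions, not the formulation used in the paper.

def relation_vector(head, tail):
    """Model a relation as the difference vector tail - head (an assumption)."""
    return [t - h for h, t in zip(head, tail)]

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def bilingual_constraint_penalty(src_head, src_tail, tgt_head, tgt_tail):
    """Zero when the source-side relation vector equals the target-side one,
    and larger the more the two relation vectors disagree."""
    r_src = relation_vector(src_head, src_tail)
    r_tgt = relation_vector(tgt_head, tgt_tail)
    return squared_distance(r_src, r_tgt)

# Toy 2-d embeddings with made-up values.
src_head = [1.0, 0.0]       # source entity word (head)
src_tail = [1.0, 2.0]       # source entity word (tail)
tgt_head = [0.0, 1.0]       # translation of the head
tgt_tail_good = [0.0, 3.0]  # translation that preserves the relation
tgt_tail_bad = [2.0, 1.0]   # translation that does not

penalty_good = bilingual_constraint_penalty(src_head, src_tail, tgt_head, tgt_tail_good)
penalty_bad = bilingual_constraint_penalty(src_head, src_tail, tgt_head, tgt_tail_bad)
```

In a training setup, a term of this kind would typically be added to the translation loss with a weighting coefficient; whether the paper does so in this form is not stated in the abstract.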


Neural machine translation, Knowledge constraint, Knowledge graph



The research work described in this paper has been supported by the National Key Research and Development Program of China under Grant No. 2016QY02D0303 and the Natural Science Foundation of China under Grant No. 61673380.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China
