A Part-of-Speech Enhanced Neural Conversation Model

  • Chuwei Luo
  • Wenjie Li
  • Qiang Chen
  • Yanxiang He
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10193)


Modeling the syntactic information of sentences is essential for neural response generation models to produce appropriate responses of high linguistic quality. However, no previous work on conversational response generation with sequence-to-sequence (Seq2Seq) neural network models has been reported to take sentence syntactic information into account. In this paper, we present two part-of-speech (POS) enhanced models that incorporate POS information into the Seq2Seq neural conversation model. When training these models, the corresponding POS tag is attached to each word in the post and the response so that the word sequences and the POS tag sequences are interrelated. When a word in a response is to be generated, it is constrained by the expected POS tag. The experimental results show that the POS-enhanced Seq2Seq models generate more grammatically correct and appropriate responses, in terms of both perplexity and BLEU, than the word-level Seq2Seq model.
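The POS constraint described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical and not the paper's implementation: the function name, the toy vocabulary, the tag inventory, and the scores are invented for illustration. The idea sketched is that, at each decoding step, candidate words whose POS tag does not match the expected tag are masked out before the next word is chosen.

```python
# Hypothetical sketch of POS-constrained word selection at one decoding step.
# Vocabulary, tags, and scores are toy values, not from the paper.

def constrain_by_pos(word_scores, word_pos, expected_pos):
    """Zero out the scores of words whose POS tag differs from the expected tag."""
    return {w: (s if word_pos[w] == expected_pos else 0.0)
            for w, s in word_scores.items()}

# Toy decoder state: candidate words with their (single) POS tags and scores.
word_pos = {"runs": "VB", "dog": "NN", "quickly": "RB"}
scores = {"runs": 0.5, "dog": 0.3, "quickly": 0.2}

# If the model expects a verb next, only verb candidates survive.
constrained = constrain_by_pos(scores, word_pos, "VB")
best = max(constrained, key=constrained.get)  # -> "runs"
```

In the paper's actual models the POS tag sequence is learned jointly with the word sequence rather than supplied by hand; this fragment only shows the masking step in isolation.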


Keywords: Response generation · Seq2Seq neural conversation model · Syntactic information incorporation



The work described in this paper was supported by the National Natural Science Foundation of China (Grants 61272291 and 61672445) and The Hong Kong Polytechnic University (G-YBP6, 4-BCB5 and B-Q46C).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Chuwei Luo, School of Computer Science, Wuhan University, Wuhan, China
  • Wenjie Li (corresponding author), Department of Computing, The Hong Kong Polytechnic University, Kowloon Tong, Hong Kong
  • Qiang Chen, School of Computer Science, Wuhan University, Wuhan, China
  • Yanxiang He, School of Computer Science, Wuhan University, Wuhan, China
