An Empirical Study on Incorporating Prior Knowledge into BLSTM Framework in Answer Selection

  • Yahui Li
  • Muyun Yang
  • Tiejun Zhao
  • Dequan Zheng
  • Sheng Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10619)


Deep learning has become the state-of-the-art solution to answer selection. One distinguishing advantage of deep learning is that its end-to-end structure avoids manual feature engineering. Yet the literature still reports many cases in which introducing prior knowledge into the deep learning process has a positive effect. Following this thread, this paper investigates, through an empirical study, how much different kinds of prior knowledge contribute when incorporated into deep learning. Under a typical BLSTM framework, 27 features spanning 3 levels are jointly integrated into the answer selection task. The experimental results confirm that incorporating prior knowledge can enhance the model, and that linguistic features at different levels improve performance consistently.


Deep learning · BLSTM · Prior knowledge · Incorporating · Answer selection



This study was partially funded by the National High-tech R&D Program of China (863 Program, No. 2015AA015405) and the National Natural Science Foundation of China (Nos. 61370170 and 61402134). We also thank Shanshan Zhao (HIT) for help with her BLSTM framework tool and Fangying Wu (HIT) for suggestions on extracting the QA-pair level features.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Yahui Li (1), corresponding author
  • Muyun Yang (1)
  • Tiejun Zhao (1)
  • Dequan Zheng (1)
  • Sheng Li (1)

  1. Harbin Institute of Technology, Harbin, China
