BertHANK: hierarchical attention networks with enhanced knowledge and pre-trained model for answer selection

Regular Paper, published in Knowledge and Information Systems

Abstract

In recent years, Community Question Answering (CQA) has become increasingly prevalent because it provides platforms where users can gather information and share knowledge. However, a question in a CQA system often receives many candidate answers, and it is impractical for users to inspect them one by one to find the most relevant one. Answer selection is therefore an important task in CQA. In this paper, we propose a novel solution, BertHANK, a hierarchical attention network with enhanced knowledge and a pre-trained model for answer selection. Specifically, in the encoding stage, knowledge enhancement is applied to questions and a pre-trained model to answers. We then adopt a multi-attention mechanism, including cross-attention on question-answer pairs, inner attention on questions at the word level, and hierarchical inner attention on answers at both the word and sentence levels, to capture more subtle semantic features. In more detail, the cross-attention captures interactive information between the encoded questions and answers, while the hierarchical inner attention assigns different weights to words within sentences and to sentences within answers, thereby obtaining both global and local information about question-answer pairs and helping select the best-matched answer for a given question. Finally, we integrate the attended question and answer representations to make predictions. The results show that our model achieves state-of-the-art performance on two corpora, the SemEval-2015 and SemEval-2017 CQA datasets, outperforming strong baselines by a large margin.
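The pipeline described in the abstract can be sketched end to end. The following PyTorch snippet is a minimal, illustrative reconstruction rather than the authors' implementation: the nn.Embedding and BiLSTM layers stand in for the knowledge-enhanced question encoder and the BERT-based answer encoder, and all names (InnerAttention, BertHANKSketch), dimensions, and hyperparameters are assumptions made for the example.

```python
# Minimal sketch of a BertHANK-style pipeline (assumed, not the authors' code).
import torch
import torch.nn as nn


class InnerAttention(nn.Module):
    """Additive attention that pools a sequence of hidden states into one vector."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.score = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                                    # h: (batch, seq, dim)
        w = torch.softmax(self.score(torch.tanh(self.proj(h))), dim=1)
        return (w * h).sum(dim=1)                            # (batch, dim)


class BertHANKSketch(nn.Module):
    def __init__(self, vocab_size=30000, dim=128):
        super().__init__()
        # Placeholder encoders: a knowledge-enhanced question encoder and a
        # BERT answer encoder would replace the embedding + BiLSTM layers.
        self.embed = nn.Embedding(vocab_size, dim)
        self.q_enc = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.a_enc = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.cross = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.q_word_attn = InnerAttention(dim)               # word-level, questions
        self.a_word_attn = InnerAttention(dim)               # word-level, answer sentences
        self.a_sent_attn = InnerAttention(dim)               # sentence-level, answers
        self.classifier = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, q_ids, a_ids):
        # q_ids: (batch, q_len); a_ids: (batch, n_sents, s_len)
        b, n_sents, s_len = a_ids.shape
        q_h, _ = self.q_enc(self.embed(q_ids))               # (b, q_len, dim)
        a_h, _ = self.a_enc(self.embed(a_ids.view(b * n_sents, s_len)))

        # Hierarchical inner attention on answers: words -> sentences -> answer.
        sent_vecs = self.a_word_attn(a_h).view(b, n_sents, -1)
        ans_vec = self.a_sent_attn(sent_vecs)                # (b, dim)

        # Cross-attention: question tokens attend over answer sentence vectors,
        # then word-level inner attention pools the question representation.
        q_cross, _ = self.cross(q_h, sent_vecs, sent_vecs)   # (b, q_len, dim)
        q_vec = self.q_word_attn(q_cross)                    # (b, dim)

        # Integrate the attended question and answer representations.
        return self.classifier(torch.cat([q_vec, ans_vec], dim=-1)).squeeze(-1)


if __name__ == "__main__":
    model = BertHANKSketch()
    q = torch.randint(0, 30000, (2, 12))      # 2 questions, 12 tokens each
    a = torch.randint(0, 30000, (2, 4, 20))   # 2 answers, 4 sentences of 20 tokens
    print(model(q, a).shape)                  # torch.Size([2]) relevance scores
```

Under these assumptions, swapping the placeholder encoders for a knowledge-enhanced embedding layer and a BERT encoder would recover the encoding stage described above.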



Author information


Correspondence to Haitian Yang or Yan Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yang, H., Zhao, X., Wang, Y. et al. BertHANK: hierarchical attention networks with enhanced knowledge and pre-trained model for answer selection. Knowl Inf Syst 64, 2189–2213 (2022). https://doi.org/10.1007/s10115-022-01703-7
