
Open Domain Question Answering System Based on Knowledge Base

  • Conference paper
Natural Language Understanding and Intelligent Applications (ICCPOL 2016, NLPCC 2016)

Abstract

For the open-domain knowledge base question answering task of NLP&CC 2016, we propose SPE (subject predicate extraction), an algorithm that automatically extracts a subject-predicate pair from a simple question and translates it into a KB query. After a simple topic entity linking step, candidate predicates are scored by a novel method based on word vector similarity and predicate attention. Our approach achieved an F1-score of 82.47% on the test data, taking first place in the NLP&CC 2016 Shared Task 2 (KBQA sub-task). We also report a series of experiments and a comprehensive error analysis that reveal the properties and defects of the new data set.
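As a rough illustration of the pipeline the abstract outlines, the sketch below links a subject by exact token match, scores each of its predicates against the remaining question words using cosine similarity of word vectors, and looks up the answer in a toy KB. Everything here is an assumption made for illustration: the names `cosine`, `score_predicate`, and `extract_pair`, the toy vectors and KB, and the max-then-average aggregation are not taken from the paper, and the predicate attention component is omitted entirely.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def score_predicate(question_tokens, predicate_tokens, vectors):
    """Average, over predicate tokens, of the best cosine match in the question.

    This aggregation is an assumption, not the paper's scoring formula.
    """
    q_vecs = [vectors[t] for t in question_tokens if t in vectors]
    p_vecs = [vectors[t] for t in predicate_tokens if t in vectors]
    if not q_vecs or not p_vecs:
        return 0.0
    return sum(max(cosine(p, q) for q in q_vecs) for p in p_vecs) / len(p_vecs)

def extract_pair(question_tokens, kb, vectors):
    """Return the best (subject, predicate, score) triple for a simple question."""
    best = (None, None, -1.0)
    for subject, facts in kb.items():
        # Naive entity linking: the subject must appear verbatim as a token.
        if subject not in question_tokens:
            continue
        rest = [t for t in question_tokens if t != subject]
        for predicate in facts:
            score = score_predicate(rest, predicate.split("_"), vectors)
            if score > best[2]:
                best = (subject, predicate, score)
    return best

# Toy data, invented for this example.
vectors = {
    "born": [1.0, 0.0, 0.0],
    "birthplace": [0.9, 0.1, 0.0],
    "profession": [0.0, 0.1, 0.9],
    "where": [0.6, 0.0, 0.1],
}
kb = {"einstein": {"birthplace": "Ulm", "profession": "physicist"}}

subject, predicate, score = extract_pair(["where", "was", "einstein", "born"], kb, vectors)
answer = kb[subject][predicate]  # the extracted pair acts as the KB query
```

On this toy input the pair (`einstein`, `birthplace`) wins because `birthplace` lies close to `born` in the invented vector space, so the lookup returns `Ulm`.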

Notes

  1. https://code.google.com/archive/p/word2vec.

Acknowledgement

We would like to thank the members of our NLP group and the anonymous reviewers for their helpful feedback. This work was supported by the National High Technology R&D Program of China (Grant No. 2015AA015403, 2014AA015102), the Natural Science Foundation of China (Grant No. 61202233, 61272344, 61370055) and the joint project with IBM Research. Please direct any correspondence to Yansong Feng.

Author information

Corresponding author

Correspondence to Yansong Feng.

Appendices

Appendix A

The eight regular expressions shown in Table 8 capture the non-core parts of a question. They are applied in order.

Table 8. Regular expressions for core question extraction
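The idea of applying patterns in a fixed order to strip non-core text can be sketched as follows. The three patterns and the name `extract_core` are hypothetical and written for English input purely for readability; they are not the eight expressions of Table 8, whose contents are not reproduced here.

```python
import re

# Hypothetical patterns for illustration only, NOT the ones from Table 8.
# Order matters: each pattern sees the output of the previous one.
NON_CORE_PATTERNS = [
    re.compile(r"^(please|could you)\s+(tell me|let me know)\s+", re.IGNORECASE),
    re.compile(r"^(i want to know|do you know)\s+", re.IGNORECASE),
    re.compile(r"[?!.\s]+$"),  # trailing punctuation and whitespace
]

def extract_core(question):
    """Strip non-core fragments by applying each pattern once, in order."""
    core = question.strip()
    for pattern in NON_CORE_PATTERNS:
        core = pattern.sub("", core)
    return core

core = extract_core("Please tell me do you know who wrote Hamlet?")
```

Because the second pattern is anchored at the start of the string, it only matches here after the first pattern has removed the polite prefix, which is why the order of application matters.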

Appendix B

The rules used to clean the KB are shown in Table 9.

Table 9. KB cleaning rules

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Lai, Y., Lin, Y., Chen, J., Feng, Y., Zhao, D. (2016). Open Domain Question Answering System Based on Knowledge Base. In: Lin, CY., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_65

  • DOI: https://doi.org/10.1007/978-3-319-50496-4_65

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50495-7

  • Online ISBN: 978-3-319-50496-4

  • eBook Packages: Computer Science (R0)
