
Topic enhanced deep structured semantic models for knowledge base question answering

  • Research Paper
  • Special Focus on Natural Language Processing and Social Computing
Science China Information Sciences

Abstract

Knowledge base question answering (KBQA) is an active research topic in natural language processing (NLP). Its central challenge is to understand the semantics of natural language questions and to bridge the semantic gap between those questions and the structured fact triples in a knowledge base. This paper focuses on simple questions, which can be answered by a single fact triple in the knowledge base. We propose a topic enhanced deep structured semantic model for KBQA. The proposed method treats KBQA as a matching problem between questions and the subjects and predicates in the knowledge base, and the model matches subjects and predicates in two separate stages. In the first stage, we propose a Convolutional based Topic Entity Extraction Model (CTEEM) to extract the topic entities mentioned in questions. With the extracted entities, we can retrieve the relevant candidate fact triples from the knowledge base and markedly reduce the number of noisy candidates. In the second stage, we employ Deep Structured Semantic Models (DSSMs) to compute the semantic relevance score between questions and the predicates in the candidates. We then combine the semantic-level and lexical-level scores to rank the candidates. We evaluate the proposed method on the KBQA dataset released by NLPCC-ICCPOL 2016. The experimental results show that the proposed method ranks third among the 21 submitted systems. Furthermore, we extend the DSSM with a BiLSTM and add a convolutional structure on top of the BiLSTM layers. Our experimental results show that these extended models further improve performance.
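The two-stage matching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it is hypothetical, the knowledge base is invented toy data, a substring match stands in for CTEEM, and a cosine similarity over character trigram counts stands in for a trained DSSM. The abstract does not say how the semantic and lexical scores are combined, so the linear interpolation with weight `alpha` is an assumption.

```python
import math
from collections import Counter

# Toy knowledge base of (subject, predicate, object) triples. Invented data.
KB = [
    ("Paris", "country", "France"),
    ("Paris", "population", "2.1 million"),
    ("Berlin", "country", "Germany"),
]


def extract_topic_entity(question, kb):
    """Stage 1 stand-in for CTEEM: longest KB subject mentioned in the question."""
    subjects = {subject for subject, _, _ in kb}
    mentioned = [s for s in subjects if s.lower() in question.lower()]
    return max(mentioned, key=len) if mentioned else None


def char_ngrams(text, n=3):
    """Count character n-grams of the lowercased text, padded with '#'."""
    padded = "#" + text.lower() + "#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def semantic_score(question, predicate):
    """Stage 2 stand-in for a DSSM: cosine similarity of n-gram count vectors."""
    q, p = char_ngrams(question), char_ngrams(predicate)
    dot = sum(q[g] * p[g] for g in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def lexical_score(question, predicate):
    """Fraction of predicate words that literally appear in the question."""
    q_words = set(question.lower().replace("?", "").split())
    p_words = predicate.lower().split()
    if not p_words:
        return 0.0
    return sum(w in q_words for w in p_words) / len(p_words)


def answer(question, kb, alpha=0.5):
    """Retrieve triples for the topic entity, then rank their predicates."""
    entity = extract_topic_entity(question, kb)
    if entity is None:
        return None
    candidates = [t for t in kb if t[0] == entity]

    def combined(triple):
        # Assumed combination: linear interpolation of the two scores.
        return alpha * semantic_score(question, triple[1]) + (
            1 - alpha
        ) * lexical_score(question, triple[1])

    return max(candidates, key=combined)[2]
```

In this sketch, `answer("What is the population of Paris?", KB)` first narrows the candidates to the two triples whose subject is `Paris`, then ranks the predicate `population` above `country`. Restricting candidates by topic entity before scoring predicates is what keeps unrelated triples out of the ranking step.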



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61573163, 71571084), the Fundamental Research Funds for the Central Universities (Grant No. CCNU16A02024), and the Wuhan Youth Science and Technology Plan.

Author information

Corresponding author

Correspondence to Guangyou Zhou.

About this article

Cite this article

Xie, Z., Zeng, Z., Zhou, G. et al. Topic enhanced deep structured semantic models for knowledge base question answering. Sci. China Inf. Sci. 60, 110103 (2017). https://doi.org/10.1007/s11432-017-9136-x

  • DOI: https://doi.org/10.1007/s11432-017-9136-x
