Knowledge and Information Systems, Volume 55, Issue 3, pp 529–569

Core techniques of question answering systems over knowledge bases: a survey

  • Dennis Diefenbach
  • Vanessa Lopez
  • Kamal Singh
  • Pierre Maret
Survey Paper

Abstract

The Semantic Web contains an enormous amount of information in the form of knowledge bases (KBs). To make this information accessible, many question answering (QA) systems over KBs have been created in recent years. Building a QA system over KBs is difficult because many different challenges must be solved. To address these challenges, QA systems generally combine techniques from natural language processing, information retrieval, machine learning and the Semantic Web. The aim of this survey is to give an overview of the techniques used in current QA systems over KBs. We present the techniques used by the QA systems that were evaluated on a popular series of benchmarks: Question Answering over Linked Data (QALD). Techniques that solve the same task are first grouped together and then described. The advantages and disadvantages of each technique are discussed, which allows a direct comparison of similar techniques. Additionally, we point to techniques that are used over WebQuestions and SimpleQuestions, two other popular benchmarks for QA systems.

Keywords

Question answering, QALD, WebQuestions, SimpleQuestions, Survey, Semantic Web, Knowledge base

Notes

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie Grant agreement No. 642795.

Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  1. Laboratoire Hubert Curien, CNRS UMR 5516, Université de Lyon, Saint-Étienne, France
  2. IBM Research Ireland, Dublin, Ireland
