Abstract
Systematic benchmark evaluation plays an important role in improving the technologies behind Question Answering (QA) systems. Although a number of evaluation methods exist for natural language (NL) QA systems, most of them consider only the final answers, which limits them to black-box evaluation. Here, we propose a subdivided evaluation approach that enables finer-grained evaluation of QA systems, and present an evaluation tool targeting the NL question (NLQ) interpretation step, the initial step of a QA pipeline. Experiments on two public benchmark datasets suggest that the proposed approach yields deeper insight into the performance of a QA system than black-box approaches do, and thus provides better guidance for improving the system.
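The step-wise evaluation idea in the abstract can be illustrated with a minimal sketch: instead of scoring only final answers, one can score the NLQ interpretation step directly by comparing the structure of a generated SPARQL query against a gold-standard query. The function names, the triple-pattern comparison, and the simplified query format below are illustrative assumptions, not the authors' actual tool or metric.

```python
# Hypothetical sketch of fine-grained QA evaluation: score the question
# interpretation step by comparing the triple patterns of a generated
# SPARQL WHERE clause against those of a gold-standard query.

def triple_patterns(sparql_where):
    """Extract a set of whitespace-normalized triple patterns from a
    simplified WHERE-clause string (patterns separated by '.')."""
    return {" ".join(p.split()) for p in sparql_where.split(".") if p.strip()}

def interpretation_f1(generated_where, gold_where):
    """Precision, recall, and F1 of generated triple patterns vs. gold."""
    gen, gold = triple_patterns(generated_where), triple_patterns(gold_where)
    if not gen or not gold:
        return 0.0, 0.0, 0.0
    tp = len(gen & gold)                    # patterns shared with the gold query
    p, r = tp / len(gen), tp / len(gold)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# The interpretation can be partially correct even when the final answer
# (and hence a black-box metric) fails entirely:
gold = "?x rdf:type dbo:Film . ?x dbo:director dbr:Nolan"
gen  = "?x rdf:type dbo:Film . ?x dbo:producer dbr:Nolan"
print(interpretation_f1(gen, gold))  # → (0.5, 0.5, 0.5)
```

A black-box evaluation would score the second query as a total failure, whereas the step-wise score above shows that half of the interpretation was recovered, which is the kind of finer-grained signal the paper argues for.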
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Asakura, T., Kim, JD., Yamamoto, Y., Tateisi, Y., Takagi, T. (2018). A Quantitative Evaluation of Natural Language Question Interpretation for Question Answering Systems. In: Ichise, R., Lecue, F., Kawamura, T., Zhao, D., Muggleton, S., Kozaki, K. (eds) Semantic Technology. JIST 2018. Lecture Notes in Computer Science(), vol 11341. Springer, Cham. https://doi.org/10.1007/978-3-030-04284-4_15
Print ISBN: 978-3-030-04283-7
Online ISBN: 978-3-030-04284-4