A Quantitative Evaluation of Natural Language Question Interpretation for Question Answering Systems

  • Takuto Asakura (corresponding author)
  • Jin-Dong Kim
  • Yasunori Yamamoto
  • Yuka Tateisi
  • Toshihisa Takagi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11341)

Abstract

Systematic benchmark evaluation plays an important role in improving the technologies behind Question Answering (QA) systems. Although a number of evaluation methods exist for natural language (NL) QA systems, most of them consider only the final answers, which limits them to black-box evaluation. Herein, we propose a subdivided evaluation approach that enables finer-grained evaluation of QA systems, and present an evaluation tool targeting the NL question (NLQ) interpretation step, the initial step of a QA pipeline. Experiments on two public benchmark datasets suggest that the proposed approach yields deeper insight into the performance of a QA system than black-box approaches do, and therefore provides better guidance for improving such systems.
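As a concrete illustration of what subdivided evaluation of the NLQ interpretation step can look like, consider the minimal sketch below. It is our own simplification, not the authors' tool: the function names are hypothetical, and the regex-based triple-pattern extraction is a naive placeholder where a real evaluator would use a proper SPARQL parser. The idea is to score the intermediate output of the interpretation step (the triple patterns of a generated SPARQL query) against a gold-standard query, rather than comparing only the final answers.

    # Minimal sketch (not the authors' implementation) of subdivided
    # evaluation: compare the intermediate output of the NLQ-interpretation
    # step -- the triple patterns of a generated SPARQL query -- against a
    # gold query, instead of comparing only the final answer sets.
    import re

    def triple_patterns(sparql: str) -> set:
        """Extract a rough set of triple patterns from a WHERE clause.

        Naive placeholder: a real evaluator would use a SPARQL parser.
        """
        body = re.search(r"\{(.*)\}", sparql, re.DOTALL)
        if body is None:
            return set()
        # Split on '.' statement separators and normalise whitespace.
        return {" ".join(t.split()) for t in body.group(1).split(" .") if t.strip()}

    def interpretation_scores(generated: str, gold: str) -> dict:
        """Precision/recall/F1 of generated triple patterns vs. the gold query."""
        gen, ref = triple_patterns(generated), triple_patterns(gold)
        tp = len(gen & ref)
        p = tp / len(gen) if gen else 0.0
        r = tp / len(ref) if ref else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return {"precision": p, "recall": r, "f1": f1}

    gold = "SELECT ?x WHERE { ?x a dbo:Scientist . ?x dbo:birthPlace dbr:Tokyo }"
    generated = "SELECT ?x WHERE { ?x a dbo:Scientist }"
    print(interpretation_scores(generated, gold))
    # {'precision': 1.0, 'recall': 0.5, 'f1': 0.666...}

For these example queries, the generated query recovers one of the two gold triple patterns (precision 1.0, recall 0.5, F1 ≈ 0.67), pinpointing the missing birthplace constraint; a black-box evaluation would only observe that the final answer sets differ.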


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Takuto Asakura (1), corresponding author
  • Jin-Dong Kim (3)
  • Yasunori Yamamoto (3)
  • Yuka Tateisi (4)
  • Toshihisa Takagi (2)
  1. Department of Informatics, SOKENDAI, Tokyo, Japan
  2. Department of Bioinformatics and Systems Biology, The University of Tokyo, Tokyo, Japan
  3. Database Center for Life Science, Chiba, Japan
  4. National Bioscience Database Center, Tokyo, Japan