Abstract
Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible, primarily because the content of publications is ambiguous, unstructured, and not machine-actionable. We present JarvisQA, a BERT-based system to answer questions on tabular views of scholarly knowledge graphs. Such tables appear in a variety of shapes in the scholarly literature (e.g., surveys, comparisons, or results). Our system can retrieve direct answers to a variety of questions asked on tabular data in articles. Furthermore, we present a preliminary dataset of related tables and a corresponding set of natural language questions. This dataset is used as a benchmark for our system and can be reused by others. Additionally, JarvisQA is evaluated on two datasets against other baselines and shows a two- to threefold improvement in performance compared to related methods.
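As a rough illustration of how a text-based reader such as BERT could be pointed at tabular data, the sketch below verbalizes table rows into sentences and then selects the sentence most relevant to a question. The table contents, the function names `table_to_sentences` and `best_sentence`, and the sentence template are hypothetical, and the token-overlap scorer is a deliberately simple stand-in for a BERT reader; none of this is taken from the JarvisQA implementation.

```python
import re
from typing import Dict, List, Set


def table_to_sentences(rows: List[Dict[str, str]], key: str) -> List[str]:
    """Turn every non-key cell into a sentence: "<entity>'s <column> is <value>."."""
    sentences = []
    for row in rows:
        entity = row[key]
        for column, value in row.items():
            if column == key or not value:
                continue  # skip the key column itself and empty cells
            sentences.append(f"{entity}'s {column} is {value}.")
    return sentences


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_sentence(question: str, sentences: List[str]) -> str:
    """Pick the sentence sharing the most tokens with the question.

    Assumption: this overlap score only stands in for a neural reader,
    which would extract an answer span instead of a whole sentence.
    """
    question_tokens = tokens(question)
    return max(sentences, key=lambda s: len(question_tokens & tokens(s)))


# Hypothetical comparison table, invented for this example.
rows = [
    {"paper": "System A", "task": "question answering", "dataset": "SQuAD"},
    {"paper": "System B", "task": "entity linking", "dataset": "AIDA"},
]

sentences = table_to_sentences(rows, key="paper")
answer = best_sentence("Which dataset does System B use?", sentences)
print(answer)
```

In a fuller pipeline, the generated sentences would serve as the context passed to an extractive question answering model, which would return the answer span rather than the whole sentence.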
Notes
- 5. Fetched from https://www.orkg.org/orkg/c/Zg4b1N.
Acknowledgments
This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. The authors would like to thank their colleagues Kheir Eddine Farfar, Manuel Prinz, and especially Allard Oelen and Vitalis Wiens for their valuable input and comments.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jaradeh, M.Y., Stocker, M., Auer, S. (2020). Question Answering on Scholarly Knowledge Graphs. In: Hall, M., Merčun, T., Risse, T., Duchateau, F. (eds) Digital Libraries for Open Knowledge. TPDL 2020. Lecture Notes in Computer Science, vol 12246. Springer, Cham. https://doi.org/10.1007/978-3-030-54956-5_2
DOI: https://doi.org/10.1007/978-3-030-54956-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54955-8
Online ISBN: 978-3-030-54956-5
eBook Packages: Computer Science, Computer Science (R0)