Abstract
Large-scale pre-trained language models (PLMs) such as BERT have recently achieved great success and become a milestone in natural language processing (NLP). It is now the consensus of the NLP community to adopt PLMs as the backbone for downstream tasks, and recent work on knowledge graph question answering (KGQA) routinely builds on BERT or its variants. However, there is still a lack of comprehensive research comparing the performance of different PLMs in KGQA. To this end, we summarize two basic PLM-based KGQA frameworks, free of additional neural network modules, to compare nine PLMs in terms of accuracy and efficiency. In addition, we present three benchmarks over larger-scale KGs, derived from the popular SimpleQuestions benchmark, to investigate the scalability of PLMs. We carefully analyze the results of all PLM-based basic frameworks on these benchmarks and on two other popular datasets, WebQuestionsSP and FreebaseQA, and find that knowledge distillation techniques and knowledge enhancement methods in PLMs are promising for KGQA. Furthermore, we test ChatGPT (https://chat.openai.com/), which has drawn a great deal of attention in the NLP community, demonstrating its impressive capabilities and limitations in zero-shot KGQA. We have released the code and benchmarks to promote the use of PLMs in KGQA (https://github.com/aannonymouuss/PLMs-in-Practical-KBQA).
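The basic frameworks compared in the paper follow a retrieve-then-rank pattern. A minimal, hypothetical sketch of that kind of pipeline (all names here are ours, not the paper's) is shown below: detect the topic entity, rank candidate relations with a PLM scorer, then read the answer off the KG. A toy dict-backed KG stands in for Freebase, and trivial lambdas stand in for the PLM components.

```python
# Hypothetical sketch of a PLM-based KGQA pipeline; `TinyKG`, `detect_entity`
# and `rank_relation` are illustrative stand-ins, not the paper's code.

class TinyKG:
    def __init__(self, triples):
        self.triples = triples  # list of (subject, relation, object)

    def relations(self, subject):
        # All relations whose subject matches the detected topic entity.
        return sorted({r for s, r, o in self.triples if s == subject})

    def objects(self, subject, relation):
        return [o for s, r, o in self.triples if s == subject and r == relation]


def answer(question, kg, detect_entity, rank_relation):
    entity = detect_entity(question)                # e.g. a PLM span tagger
    candidates = kg.relations(entity)
    relation = rank_relation(question, candidates)  # e.g. a PLM cross-encoder
    return kg.objects(entity, relation)


# Toy stand-ins for the two PLM components.
kg = TinyKG([("Paris", "capital_of", "France"),
             ("Paris", "located_in", "Ile-de-France")])
detect = lambda q: "Paris"
rank = lambda q, cands: next(r for r in cands if r.split("_")[0] in q)

print(answer("Which country is Paris the capital of?", kg, detect, rank))
# prints ['France']
```

In the actual frameworks, both stand-ins are replaced by fine-tuned PLM heads, which is what makes a head-to-head comparison of nine PLMs possible without extra neural modules.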
Data Availability
All datasets and codes in this paper can be accessed from https://github.com/aannonymouuss/PLMs-in-Practical-KBQA.
Notes
These tasks are similar to named entity recognition, entity linking and relation extraction.
Scalability is the measure of a system’s ability to increase or decrease in performance and cost in response to changes in system processing demands. In our work, we explore the variation in accuracy performance and time cost with increasing KG size.
Available online: https://keywordtool.io/blog/most-asked-questions/ (accessed on 12 April 2022)
An existing KGQA approach proposed by [23] is based on KG embedding and introduces knowledge representation learning; it is not included in our frameworks. Since this work focuses on comparing various PLMs, the discussion of the effect of different KG embedding methods is reserved for future work.
Our basic frameworks are trained using an NVIDIA GeForce RTX 2080 Ti.
The dimension of h is \(1\times 1\), i.e., h is a scalar score. Different PLMs obtain h in different ways, e.g. \(h = w\cdot h_{\left[ CLS\right] }^{T}\) in BERT.
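The \(1\times 1\) dimension of h can be made concrete with a small sketch (the vectors and the function name below are ours, purely illustrative): projecting the final-layer [CLS] representation onto a learned weight vector yields a single scalar, as in the BERT case \(h = w\cdot h_{\left[ CLS\right] }^{T}\).

```python
# Illustrative only: real PLMs use hidden sizes of 768 or more; here we use
# a 4-dimensional toy vector to show that w . h_cls^T collapses to a scalar.

def score_from_cls(h_cls, w):
    """Dot product of the [CLS] representation with a learned weight
    vector, producing the 1x1 scalar h."""
    assert len(h_cls) == len(w)
    return sum(a * b for a, b in zip(h_cls, w))

h_cls = [0.5, -1.0, 0.25, 2.0]  # toy final-layer [CLS] vector
w = [1.0, 0.5, 2.0, 0.0]        # toy learned projection weights
h = score_from_cls(h_cls, w)    # a single scalar, i.e. dimension 1x1
print(h)  # prints 0.5
```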
These data are from https://huggingface.co/.
The pre-processed datasets are available at https://github.com/aistairc/simple-qa-analysis.
The version of ChatGPT is the Jan 30 version, and the number of user requests is limited. We have released the script for accessing ChatGPT.
We have also tried to generate concise answers, such as bare entity names, via specific instructions, but this leads to worse performance.
References
Manning, C.D.: Human language understanding & reasoning. In: Daedalus, pp. 127–138 (2022). https://doi.org/10.1162/daed_a_01905
Mohammed, S., Shi, P., Lin, J.J.: Strong baselines for simple question answering over knowledge graphs with and without neural networks. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 291–296 (2018). https://doi.org/10.18653/v1/n18-2047
Lukovnikov, D., Fischer, A., Lehmann, J.: Pretrained transformers for simple question answering over knowledge graphs. In: 18th International Semantic Web Conference, pp. 470–486 (2019). https://doi.org/10.1007/978-3-030-30793-6_27
Golub, D., He, X.: Character-level question answering with attention. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1598–1607 (2016). https://doi.org/10.18653/v1/d16-1166
Petrochuk, M., Zettlemoyer, L.: Simple questions nearly solved: a new upperbound and baseline approach. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 554–558 (2018). https://doi.org/10.18653/v1/d18-1051
Türe, F., Jojic, O.: No need to pay Attention: simple recurrent neural networks work! In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2866–2872 (2017). https://doi.org/10.18653/v1/d17-1307
Yu, M., Yin, W., Hasan, K.S., Santos, C.N., Xiang, B., Zhou, B.: Improved neural relation detection for knowledge base question answering. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp. 571–581 (2017). https://doi.org/10.18653/v1/P17-1053
Cui, H., Peng, T., Feng, L., Bao, T., Liu, L.: Simple question answering over knowledge graph enhanced by question pattern classification. In: Knowl. Inf. Syst., pp. 2741–2761 (2021). https://doi.org/10.1007/s10115-021-01609-w
Lukovnikov, D., Fischer, A., Lehmann, J., Auer, S.: Neural network-based question answering over knowledge graphs on word and character level. In: Proceedings of the 26th International Conference on World Wide Web, pp. 1211–1220 (2017). https://doi.org/10.1145/3038912.3052675
Hao, Y., Liu, H., He, S., Liu, K., Zhao, J.: Pattern-revising enhanced simple question answering over knowledge bases. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 3272–3282 (2018). https://aclanthology.org/C18-1277/
Yin, W., Yu, M., Xiang, B., Zhou, B., Schütze, H.: Simple question answering by attentive convolutional neural network. In: Proceedings of the 26th International Conference on Computational Linguistics, pp. 1746–1756 (2016). https://aclanthology.org/C16-1164/
Zhao, W., Chung, T., Goyal, A.K., Metallinou, A.: Simple question answering with subgraph ranking and joint-scoring. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 324–334 (2019). https://doi.org/10.18653/v1/n19-1029
Hao, Y., Zhang, Y., Liu, K., He, S., Liu, Z., Wu, H., Zhao, J.: An end-to-end model for question answering over knowledge base with cross-attention combining global knowledge. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp. 221–231 (2017). https://doi.org/10.18653/v1/P17-1021
Luo, D., Su, J., Yu, S.: A BERT-based approach with relation-aware attention for knowledge base question answering. In: 2020 International Joint Conference on Neural Networks, pp. 1–8 (2020). https://doi.org/10.1109/IJCNN48605.2020.9207186
Dai, Z., Li, L., Xu, W.: CFO: conditional focused neural question answering with large-scale knowledge bases. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 800–810 (2016). https://doi.org/10.18653/v1/p16-1076
Lan, Y., Wang, S., Jiang, J.: Knowledge base question answering with a matching-aggregation model and question-specific contextual relations. In: IEEE ACM Trans. Audio Speech Lang. Process., pp. 1629–1638 (2019). https://doi.org/10.1109/TASLP.2019.2926125
Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. In: Neural Computation, pp. 1735–1780 (1997)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Communications of the ACM, pp. 84–90 (2017). https://doi.org/10.1145/3065386
Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 4171–4186 (2019). https://doi.org/10.18653/v1/n19-1423
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. In: arXiv:1907.11692 (2019)
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. In: 8th International Conference on Learning Representations (2020)
Huang, X., Zhang, J., Li, D., Li, P.: Knowledge graph embedding based question answering. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 105–113 (2019). https://doi.org/10.1145/3289600.3290956
Bordes, A., Usunier, N., Chopra, S., Weston, J.: Large-scale simple question answering with memory networks. In: arXiv:1506.02075 (2015)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Pennington, J., Socher, R., Manning, C.D.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532–1543 (2014). https://doi.org/10.3115/v1/d14-1162
Reimers, N., Gurevych, I.: Sentence-BERT: Sentence embeddings using siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, pp. 3980–3990 (2019). https://doi.org/10.18653/v1/D19-1410
Li, B.Z., Min, S., Iyer, S., Mehdad, Y., Yih, W.: Efficient one-pass end-to-end entity linking for questions. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pp. 6433–6441 (2020). https://doi.org/10.18653/v1/2020.emnlp-main.522
Wu, L.Y., Petroni, F., Josifoski, M., Riedel, S., Zettlemoyer, L.: Scalable zero-shot entity linking with dense entity retrieval. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pp. 6397–6407 (2020). https://doi.org/10.18653/v1/2020.emnlp-main.519
Chen, S., Wang, J., Jiang, F., Lin, C.: Improving entity linking by modeling latent entity type information. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, pp. 7529–7537 (2020)
Oliya, A., Saffari, A., Sen, P., Ayoola, T.: End-to-end entity resolution and question answering using differentiable knowledge graphs. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 4193–4200 (2021). https://doi.org/10.18653/v1/2021.emnlp-main.345
Wang, Z., Ng, P.K., Nallapati, R., Xiang, B.: Retrieval, re-ranking and multi-task learning for knowledge-base question answering. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, pp. 347–357 (2021). https://doi.org/10.18653/v1/2021.eacl-main.26
Yamada, I., Asai, A., Shindo, H., Takeda, H., Matsumoto, Y.: LUKE: deep contextualized entity representations with entity-aware self-attention. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pp. 6442–6454 (2020). https://doi.org/10.18653/v1/2020.emnlp-main.523
Zhang, T., Wang, C., Hu, N., Qiu, M., Tang, C., He, X., Huang, J.: DKPLM: decomposable knowledge-enhanced pre-trained language model for natural language understanding. In: Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 11703–11711 (2022)
Zhang, Z., Han, X., Liu, Z., Jiang, X., Sun, M., Liu, Q.: ERNIE: enhanced language representation with informative entities. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, pp. 1441–1451 (2019). https://doi.org/10.18653/v1/p19-1139
Wang, X., Gao, T., Zhu, Z., Liu, Z., Li, J., Tang, J.: KEPLER: a unified model for knowledge embedding and pre-trained language representation. In: Transactions of the Association for Computational Linguistics, 9, pp. 176–194. (2021). https://doi.org/10.1162/tacl_a_00360
Peters, M.E., Neumann, M., Logan IV, R.L., Schwartz, R., Joshi, V., Singh, S., Smith, N.A.: Knowledge enhanced contextual word representations. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, pp. 43–54 (2019). https://doi.org/10.18653/v1/D19-1005
Bollacker, K.D., Evans, C., Paritosh, P.K., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 1247–1250 (2008). https://doi.org/10.1145/1376616.1376746
Sullivan, D.: A reintroduction to our knowledge graph and knowledge panels. https://blog.google/products/search/about-knowledge-graph-and-knowledge-panels/ (2020). Accessed 3 Oct 2022
Lan, Y., Jiang, J.: Query graph generation for answering multi-hop complex questions from knowledge bases. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 969–974 (2020). https://doi.org/10.18653/v1/2020.acl-main.91
Gu, Y., Kase, S.E., Vanni, M.T., Sadler, B.M., Liang, P., Yan, X., Su, Y.: Beyond I.I.D.: Three levels of generalization for question answering on knowledge bases. In: Proceedings of the Web Conference, pp. 3477–3488 (2021). https://doi.org/10.1145/3442381.3449992
Ye, X., Yavuz, S., Hashimoto, K., Zhou, Y., Xiong, C.: RNG-KBQA: generation augmented iterative ranking for knowledge base question answering. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, pp. 6032–6043 (2022). https://doi.org/10.18653/v1/2022.acl-long.417
Gu, Y., Su, Y.: ArcaneQA: dynamic program induction and contextualized encoding for knowledge base question answering. In: Proceedings of the 29th International Conference on Computational Linguistics, pp. 1718–1731 (2022)
Chen, S., Liu, Q., Yu, Z., Lin, C., Lou, J., Jiang, F.: ReTraCk: a flexible and efficient framework for knowledge base question answering. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, pp. 325–336 (2021). https://doi.org/10.18653/v1/2021.acl-demo.39
Qin, K., Li, C., Pavlu, V., Aslam, J.A.: Improving query graph generation for complex question answering over knowledge base. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 4201–4207 (2021). https://doi.org/10.18653/v1/2021.emnlp-main.346
Xie, T., Wu, C., Shi, P., Zhong, R., Scholak, T., Yasunaga, M., Wu, C., Zhong, M., Yin, P., Wang, S.I., Zhong, V., Wang, B., Li, C., Boyle, C., Ni, A., Yao, Z., Radev, D., Xiong, C., Kong, L., Zhang, R., Smith, N.A., Zettlemoyer, L., Yu, T.: UnifiedSKG: unifying and multi-tasking structured knowledge grounding with text-to-text language models. In: arXiv:2201.05966 (2022)
Jiao, X., Yin, Y., Shang, L., Jiang, X., Chen, X., Li, L., Wang, F., Liu, Q.: TinyBERT: distilling BERT for natural language understanding. In: Findings of the Association for Computational Linguistics, pp. 4163–4174 (2020). https://doi.org/10.18653/v1/2020.findings-emnlp.372
Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. In: arXiv:1910.01108 (2019)
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf (2018). Accessed 4 Oct 2022
Yang, Z., Dai, Z., Yang, Y., Carbonell, J.G., Salakhutdinov, R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. In: Advances in Neural Information Processing Systems, pp. 5754–5764 (2019)
Zhang, C., Lai, Y., Feng, Y., Zhao, D.: A review of deep learning in question answering over knowledge bases. In: AI Open, pp. 205–215 (2021)
Lan, Y., He, G., Jiang, J., Jiang, J., Zhao, W.X., Wen, J.: Complex knowledge base question answering: a survey. In: IEEE TKDE (2021)
Gu, Y., Pahuja, V., Cheng, G., Su, Y.: Knowledge base question answering: a semantic parsing perspective. In: arXiv:2209.04994 (2022)
Qiu, X., Sun, T., Xu, Y., Shao, Y., Dai, N., Huang, X.: Pre-trained models for natural language processing: a survey. In: Science China Technological Sciences, pp. 1872–1897 (2020)
Han, N., Topic, G., Noji, H., Takamura, H., Miyao, Y.: An empirical analysis of existing systems and datasets toward general simple question answering. In: COLING, pp. 5321–5334 (2020)
Jiang, K., Wu, D., Jiang, H.: FreebaseQA: a new factoid QA data set matching trivia-style question-answer pairs with Freebase. In: North American Chapter of the Association for Computational Linguistics, pp. 318–323 (2019)
Yih, W., Richardson, M., Meek, C., Chang, M., Suh, J.: The value of semantic parse labeling for knowledge base question answering. In: Annual Meeting of the Association for Computational Linguistics (2016)
Hu, N., Bi, S., Qi, G., Wang, M., Hua, Y., Shen, S.: Improving core path reasoning for the weakly supervised knowledge base question answering. In: DASFAA, pp. 162–170 (2022)
Zhang, J., Zhang, X., Yu, J., Tang, J., Tang, J., Li, C., Chen, H.: Subgraph retrieval enhanced model for multi-hop knowledge base question answering. In: Annual Meeting of the Association for Computational Linguistics, pp. 5773–5784 (2022)
Das, R., Zaheer, M., Thai, D.N., Godbole, A., Perez, E., Lee, J., Tan, L., Polymenakos, L., McCallum, A.: Case-based reasoning for natural language queries over knowledge bases. In: Conference on Empirical Methods in Natural Language Processing, pp. 9594–9611 (2021)
Ye, X., Yavuz, S., Hashimoto, K., Zhou, Y., Xiong, C.: RNG-KBQA: generation augmented iterative ranking for knowledge base question answering. In: Annual Meeting of the Association for Computational Linguistics, pp. 6032–6043 (2021)
Qin, C., Zhang, A., Zhang, Z., Chen, J., Yasunaga, M., Yang, D.: Is ChatGPT a general-purpose natural language processing task solver? In: arXiv:2302.06476 (2023)
Bang, Y., Cahyawijaya, S., Lee, N., Dai, W., Su, D., Wilie, B., Lovenia, H., Ji, Z., Yu, T., Chung, W., Do, Q.V., Xu, Y., Fung, P.: A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity. In: arXiv:2302.04023 (2023)
Guo, B., Zhang, X., Wang, Z., Jiang, M., Nie, J., Ding, Y., Yue, J., Wu, Y.: How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection. In: arXiv:2301.07597 (2023)
Christiano, P.F., Leike, J., Brown, T.B., Martic, M., Legg, S., Amodei, D.: Deep reinforcement learning from human preferences. In: Neural Information Processing Systems, pp. 4299–4307 (2017)
Funding
This work is supported by the National Natural Science Foundation of China (No. U21A20488).
Author information
Authors and Affiliations
Contributions
Nan Hu: Conceptualization, Methodology, Software, Writing - Original Draft. Yike Wu: Investigation, Software, Validation. Guilin Qi: Conceptualization, Methodology, Writing - review & editing. Dehai Min: Software. Jiaoyan Chen: Writing - review & editing. Jeff Z. Pan: Writing - review & editing. Zafar Ali: Validation.
Corresponding author
Ethics declarations
Ethical Approval
Not applicable
Competing Interests
The authors declare that there are no competing interests regarding the publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Special Issue on Knowledge-Graph-Enabled Methods and Applications for the Future Web. Guest Editors: Xin Wang, Jeff Pan, Qingpeng Zhang, and Yuan-Fang Li.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hu, N., Wu, Y., Qi, G. et al. An empirical study of pre-trained language models in simple knowledge graph question answering. World Wide Web 26, 2855–2886 (2023). https://doi.org/10.1007/s11280-023-01166-y