LLM-Based SPARQL Generation with Selected Schema from Large Scale Knowledge Base

  • Conference paper
Knowledge Graph and Semantic Computing: Knowledge Graph Empowers Artificial General Intelligence (CCKS 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1923)


Abstract

Knowledge base question answering (KBQA) aims to answer natural language questions over structured knowledge bases. Common approaches fall into two families, semantic parsing-based and retrieval-based, and both have limitations: retrieval-based methods struggle with questions that require complex reasoning, while semantic parsing methods rely on a multi-step pipeline and cannot tolerate errors made in earlier steps when generating the final logical form. In this paper, we propose a large language model (LLM)-based SPARQL generation model that accepts multiple candidate entities and relations as input, reducing its reliance on mention extraction and entity linking performance, and we introduce a mention-based entity combination strategy that produces multiple SPARQL queries for a single question to increase the chance of finding the correct answer. Our model achieves state-of-the-art performance in the CCKS2023 CKBQA competition, with an F1 score of 75.63%.
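As an illustration of the pipeline the abstract outlines, the sketch below shows how a question together with candidate entities and relations might be packed into a single generation prompt, and how enumerating one entity per mention yields several SPARQL candidates for one question. This is a minimal reconstruction, not the authors' released code: `build_prompt`, `llm_generate`, `mention_to_entities`, and `entity_to_relations` are illustrative names, and the prompt wording is assumed.

```python
from itertools import product


def build_prompt(question, entities, relations):
    """Pack the question plus candidate schema items into one generation prompt (wording assumed)."""
    return (
        "Generate a SPARQL query for the question.\n"
        f"Question: {question}\n"
        f"Candidate entities: {', '.join(entities)}\n"
        f"Candidate relations: {', '.join(relations)}\n"
        "SPARQL:"
    )


def llm_generate(prompt):
    """Placeholder for the fine-tuned LLM call (e.g., a ChatGLM-6B-style model); replace with a real model."""
    raise NotImplementedError


def candidate_queries(question, mention_to_entities, entity_to_relations):
    """Yield one SPARQL candidate per combination of linked entities (one entity chosen per mention)."""
    mentions = list(mention_to_entities)
    for combo in product(*(mention_to_entities[m] for m in mentions)):
        # Collect the relations attached to the chosen entities as the candidate schema.
        relations = sorted({r for e in combo for r in entity_to_relations.get(e, [])})
        prompt = build_prompt(question, list(combo), relations)
        yield combo, llm_generate(prompt)
```

Under this reading, each candidate query would then be executed against the knowledge base and a non-empty result kept as the final answer, which is how generating multiple SPARQL queries per question can raise the chance of hitting the correct one.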


Notes

  1. https://github.com/THUDM/ChatGLM-6B


Author information

Correspondence to Shuangtao Yang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yang, S., Teng, M., Dong, X., Bo, F. (2023). LLM-Based SPARQL Generation with Selected Schema from Large Scale Knowledge Base. In: Wang, H., Han, X., Liu, M., Cheng, G., Liu, Y., Zhang, N. (eds) Knowledge Graph and Semantic Computing: Knowledge Graph Empowers Artificial General Intelligence. CCKS 2023. Communications in Computer and Information Science, vol 1923. Springer, Singapore. https://doi.org/10.1007/978-981-99-7224-1_24

  • DOI: https://doi.org/10.1007/978-981-99-7224-1_24

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7223-4

  • Online ISBN: 978-981-99-7224-1

  • eBook Packages: Computer Science, Computer Science (R0)
