
KIMedQA: towards building knowledge-enhanced medical QA models

  • Research
  • Published in: Journal of Intelligent Information Systems

Abstract

Medical question-answering systems must extract accurate, concise, and comprehensive answers. They can better comprehend complex text and produce more helpful answers if they reason over both the explicit constraints stated in the question's textual context and the implicit, pertinent knowledge of the medical domain. Integrating Knowledge Graphs (KGs) with Language Models (LMs) is a common way to incorporate such structured information sources; however, how to effectively combine and reason over KG representations and language context remains an open question. To address this, we propose the Knowledge Infused Medical Question Answering system (KIMedQA), which employs two techniques, relevant knowledge graph selection and pruning of the large-scale graph, to handle Vector Space Inconsistency (VSI) and Excessive Knowledge Information (EKI). The representations of the query and context are then combined with the pruned knowledge graph using a pre-trained language model to generate an informed answer. Finally, we demonstrate through in-depth empirical evaluation that our approach achieves state-of-the-art results on two benchmark datasets, MASH-QA and COVID-QA. We also compared our results to ChatGPT, a robust and very powerful generative model, and found that our model outperforms ChatGPT on the F1 score and on human evaluation metrics such as adequacy.
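The pipeline the abstract describes, selecting the most relevant knowledge graph for a question and then pruning it to a small neighborhood around the question's entities before fusing it with the language model, can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the toy graphs, the overlap-based selection heuristic, and the two-hop cutoff are assumptions made for demonstration.

```python
from collections import deque

def select_relevant_kg(question_entities, kgs):
    """Pick the graph whose node set overlaps most with the question entities."""
    return max(kgs, key=lambda adj: len(question_entities & adj.keys()))

def prune_kg(adj, question_entities, max_hops=2):
    """Breadth-first search from each question entity, keeping only nodes
    reachable within max_hops; the rest of the large graph is discarded."""
    keep = set()
    for ent in question_entities:
        if ent not in adj:
            continue
        frontier = deque([(ent, 0)])
        seen = {ent}
        while frontier:
            node, depth = frontier.popleft()
            keep.add(node)
            if depth == max_hops:
                continue  # do not expand past the hop limit
            for nbr in adj.get(node, ()):
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, depth + 1))
    return keep

def build_adj(edges):
    """Undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

# Two toy graphs standing in for large-scale medical KGs (hypothetical data).
drug_kg = build_adj([
    ("aspirin", "pain"), ("pain", "inflammation"),
    ("inflammation", "arthritis"), ("arthritis", "joint"),
])
food_kg = build_adj([("vitamin c", "citrus")])

question_entities = {"aspirin"}
chosen = select_relevant_kg(question_entities, [drug_kg, food_kg])
pruned = prune_kg(chosen, question_entities, max_hops=2)
print(sorted(pruned))  # ['aspirin', 'inflammation', 'pain']
```

In the full system, the pruned subgraph would be encoded and concatenated with the query and context representations before being fed to the pre-trained language model; the pruning step above is what keeps that fused input within a tractable size.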


Notes

  1. https://openai.com/chatgpt

  2. https://github.com/ncbi-nlp/BioWordVec

  3. https://www.webmd.com/

  4. The code is available at https://github.com/aizan/kimedqa.

  5. BookCorpus (Zhu et al., 2015), CC-NEWS, OpenWebText, Stories (Trinh & Le, 2018), and English Wikipedia.

  6. https://openai.com/

References

  • Abbasiantaeb, Z., & Momtazi, S. (2022). Entity-aware answer sentence selection for question answering with transformer-based language models. Journal of Intelligent Information Systems, 59. https://doi.org/10.1007/s10844-022-00724-6

  • Auer, S., Bizer, C., Kobilarov, G., et al. (2007). Dbpedia: A nucleus for a web of open data. Lecture Notes in Computer Science, 4825. https://doi.org/10.1007/978-3-540-76298-0_52

  • Bodenreider, O. (2004). The unified medical language system (umls): integrating biomedical terminology. Nucleic Acids Research, 32. https://doi.org/10.1093/nar/gkh061

  • Bollacker, K., Evans, C., Paritosh, P., et al. (2008). Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. https://doi.org/10.1145/1376616.1376746

  • Buscaldi, D., Rosso, P., Gómez-Soriano, J. M., et al. (2010). Answering questions with an n-gram based passage retrieval engine. Journal of Intelligent Information Systems, 34. https://doi.org/10.1007/s10844-009-0082-y

  • Cao, Y., Hou, L., Li, J., et al. (2018). Joint representation learning of cross-lingual words and entities via attentive distant supervision. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. https://doi.org/10.18653/v1/D18-1021

  • Chen, D., Fisch, A., Weston, J., et al. (2017). Reading wikipedia to answer open-domain questions. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.18653/v1/P17-1171

  • Clark, C., Gardner, M. (2018). Simple and effective multi-paragraph reading comprehension. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.18653/v1/P18-1078

  • Cortes, E. G., Woloszyn, V., Barone, D., et al. (2022). A systematic review of question answering systems for non-factoid questions. Journal of Intelligent Information Systems. https://doi.org/10.1007/s10844-021-00655-8

  • Cui, Y., Che, W., Liu, T., et al. (2021). Pre-training with whole word masking for chinese bert. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29. https://doi.org/10.1109/TASLP.2021.3124365

  • Dai, Z., Yang, Z., Yang, Y., et al. (2019). Transformer-xl: Attentive language models beyond a fixed-length context. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1285

  • Devlin, J., Chang, M. W., Lee, K., et al. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805. https://doi.org/10.18653/v1/N19-1423

  • Dimitrakis, E., Sgontzos, K., & Tzitzikas, Y. (2020). A survey on question answering systems over linked data and documents. Journal of Intelligent Information Systems, 55. https://doi.org/10.1007/s10844-019-00584-7

  • Faldu, K., Sheth, A., Kikani, P., et al. (2021). Ki-bert: Infusing knowledge context for better language and domain understanding. arXiv:2104.08145. https://doi.org/10.48550/arXiv.2104.08145

  • Feng, G., Du, Z., Wu, X. (2018). A chinese question answering system in medical domain. Journal of Shanghai Jiaotong University (Science) 23. https://doi.org/10.1007/s12204-018-1982-1

  • Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5). https://doi.org/10.1037/h0031619

  • Han, X., Liu, Z., Sun, M. (2016). Joint representation learning of text and knowledge for knowledge graph completion. arXiv:1611.04125. https://doi.org/10.48550/arXiv.1611.04125

  • Huang, K., Altosaar, J., Ranganath, R. (2019). Clinicalbert: Modeling clinical notes and predicting hospital readmission. arXiv:1904.05342. https://doi.org/10.48550/arXiv.1904.05342

  • Joshi, M., Chen, D., Liu, Y., et al. (2020). Spanbert: Improving pre-training by representing and predicting spans. Transactions of the Association for Computational Linguistics, 8. https://doi.org/10.1162/tacl_a_00300

  • Khashabi, D., Min, S., Khot, T., et al. (2020). Unifiedqa: Crossing format boundaries with a single qa system. In: Findings of the Association for Computational Linguistics: EMNLP 2020. https://doi.org/10.18653/v1/2020.findings-emnlp.171

  • Kingma, D. P., Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980. https://doi.org/10.48550/arXiv.1412.6980

  • Kursuncu, U., Gaur, M., Sheth, A. (2020). Knowledge infused learning (k-il): Towards deep incorporation of knowledge in deep learning. Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE). https://doi.org/10.48550/arXiv.1912.00512

  • Lee, J., Yoon, W., Kim, S., et al. (2020). Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36. https://doi.org/10.1093/bioinformatics/btz682

  • Li, Z., Sun, Y., Zhu, J., et al. (2021). Improve relation extraction with dual attention-guided graph convolutional networks. Neural Computing and Applications, 33. https://doi.org/10.1007/s00521-020-05087-z

  • Lin, B. Y., Chen, X., Chen, J., et al. (2019). Kagnet: Knowledge-aware graph networks for commonsense reasoning. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). https://doi.org/10.18653/v1/D19-1282

  • Lin, C. Y., Wu, Y. H., & Chen, A. L. (2021). Selecting the most helpful answers in online health question answering communities. Journal of Intelligent Information Systems, 57. https://doi.org/10.1007/s10844-021-00640-1

  • Liu, W., Zhou, P., Zhao, Z., et al. (2020). K-bert: Enabling language representation with knowledge graph. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 34. https://doi.org/10.1609/aaai.v34i03.5681

  • Liu, Y., Ott, M., Goyal, N., et al. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv:1907.11692. https://doi.org/10.48550/arXiv.1907.11692

  • Lukovnikov, D., Fischer, A., Lehmann, J., et al. (2017). Neural network-based question answering over knowledge graphs on word and character level. In: Proceedings of the 26th International Conference on World Wide Web. https://doi.org/10.1145/3038912.3052675

  • Lv, S., Guo, D., Xu, J., et al. (2020). Graph-based reasoning over heterogeneous external knowledge for commonsense question answering. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 34. https://doi.org/10.1609/aaai.v34i05.6364

  • Lyu, K., Tian, Y., Shang, Y., et al. (2023). Causal knowledge graph construction and evaluation for clinical decision support of diabetic nephropathy. Journal of Biomedical Informatics. https://doi.org/10.1016/j.jbi.2023.104298

  • Mikolov, T., Chen, K., Corrado, G., et al. (2013). Efficient estimation of word representations in vector space. arXiv:1301.3781. https://doi.org/10.48550/arXiv.1301.3781

  • Möller, T., Reina, A., Jayakumar, R., et al. (2020). Covid-qa: A question answering dataset for covid-19. In: Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020. https://aclanthology.org/2020.nlpcovid19-acl.18

  • Nentidis, A., Katsimpras, G., Vandorou, E., et al. (2022). Overview of bioasq 2022: The tenth bioasq challenge on large-scale biomedical semantic indexing and question answering. In: Experimental IR Meets Multilinguality, Multimodality, and Interaction: 13th International Conference of the CLEF Association, CLEF 2022, Bologna, Italy, September 5–8, 2022, Proceedings. Springer. https://doi.org/10.1007/978-3-031-13643-6_22

  • Pampari, A., Raghavan, P., Liang, J., et al. (2018). emrqa: A large corpus for question answering on electronic medical records. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/D18-1258

  • Park, C., Park, J., & Park, S. (2020). Agcn: Attention-based graph convolutional networks for drug-drug interaction extraction. Expert Systems with Applications, 159. https://doi.org/10.1016/j.eswa.2020.113538

  • Peng, Z., Yu, H., & Jia, X. (2022). Path-based reasoning with k-nearest neighbor and position embedding for knowledge graph completion. Journal of Intelligent Information Systems. https://doi.org/10.1007/s10844-021-00671-8

  • Petroni, F., Rocktäschel, T., Lewis, P., et al. (2019). Language models as knowledge bases? In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). https://doi.org/10.18653/v1/D19-1250

  • Qin, C., Zhang, A., Zhang, Z., et al. (2023). Is chatgpt a general-purpose natural language processing task solver? arXiv:2302.06476. https://doi.org/10.48550/arXiv.2302.06476

  • Qiu, L., Xiao, Y., Qu, Y., et al. (2019). Dynamically fused graph network for multi-hop reasoning. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1617

  • Qiu, Y., Li, M., Wang, Y., et al. (2018). Hierarchical type constrained topic entity detection for knowledge base question answering. In: Companion Proceedings of the The Web Conference 2018. https://doi.org/10.1145/3184558.3186916

  • Raffel, C., Shazeer, N., Roberts, A., et al. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(140). https://doi.org/10.5555/3455716.3455856

  • Roberts, K., Simpson, M., Demner-Fushman, D., et al. (2016). State-of-the-art in biomedical literature retrieval for clinical cases: a survey of the trec 2014 cds track. Information Retrieval Journal, 19. https://doi.org/10.1007/s10791-015-9259-x

  • Savenkov, D., Agichtein, E. (2016). When a knowledge base is not enough: Question answering over knowledge bases with external text data. In: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. https://doi.org/10.1145/2911451.2911536

  • Seo, M., Kembhavi, A., Farhadi, A., et al. (2016). Bidirectional attention flow for machine comprehension. In: International Conference on Learning Representations (ICLR). https://doi.org/10.48550/arXiv.1611.01603

  • Soldaini, L., Goharian, N. (2016). Quickumls: a fast, unsupervised approach for medical concept extraction. In: MedIR workshop, SIGIR. https://ir.cs.georgetown.edu/downloads/quickumls.pdf

  • Speer, R., Chin, J., Havasi, C. (2017). Conceptnet 5.5: An open multilingual graph of general knowledge. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 31. https://doi.org/10.1609/aaai.v31i1.11164

  • Suchanek, F.M., Kasneci, G., Weikum, G. (2007). Yago: a core of semantic knowledge. In: Proceedings of the 16th international conference on World Wide Web. https://doi.org/10.1145/1242572.1242667

  • Sun, Y., Wang, S., Li, Y., et al. (2019). Ernie: Enhanced representation through knowledge integration. arXiv:1904.09223. https://doi.org/10.48550/arXiv.1904.09223

  • Suster, S., Daelemans, W. (2018). Clicr: a dataset of clinical case reports for machine reading comprehension. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. https://doi.org/10.18653/v1/N18-1140

  • Toutanova, K., Chen, D., Pantel, P., et al. (2015). Representing text for joint embedding of text and knowledge bases. In: Proceedings of the 2015 conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/D15-1174

  • Tran, T. N. T., Felfernig, A., Trattner, C., et al. (2021). Recommender systems in the healthcare domain: state-of-the-art and research issues. Journal of Intelligent Information Systems, 57. https://doi.org/10.1007/s10844-020-00633-6

  • Trinh, T. H., Le, Q. V. (2018). A simple method for commonsense reasoning. arXiv:1806.02847. https://doi.org/10.48550/arXiv.1806.02847

  • Wang, Q., Mao, Z., Wang, B., et al. (2017). Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29. https://doi.org/10.1109/TKDE.2017.2754499

  • Wang, X., Kapanipathi, P., Musa, R., et al. (2019). Improving natural language inference using external knowledge in the science questions domain. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 33. https://doi.org/10.1609/aaai.v33i01.33017208

  • Wang, X., Gao, T., Zhu, Z., et al. (2021). Kepler: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics. https://doi.org/10.1162/tacl_a_00360

  • Wang, Z., Zhang, J., Feng, J., et al. (2014). Knowledge graph and text jointly embedding. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.3115/v1/D14-1167

  • Wang, Z., Ng, P., Ma, X., et al. (2019). Multi-passage bert: A globally normalized bert model for open-domain question answering. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). https://doi.org/10.18653/v1/D19-1599

  • Wishart, D. S., Feunang, Y. D., Guo, A. C., et al. (2018). Drugbank 5.0: a major update to the drugbank database for 2018. Nucleic Acids Research, 46. https://doi.org/10.1093/nar/gkx1037

  • Xiong, Y., Peng, H., Xiang, Y., et al. (2022). Leveraging multi-source knowledge for chinese clinical named entity recognition via relational graph convolutional network. Journal of Biomedical Informatics, 128. https://doi.org/10.1016/j.jbi.2022.104035

  • Yang, Z., Dai, Z., Yang, Y., et al. (2019). Xlnet: Generalized autoregressive pretraining for language understanding. Advances in Neural Information Processing Systems 32. https://doi.org/10.48550/arXiv.1906.08237

  • Yao, L., Mao, C., Luo, Y. (2019). Kg-bert: Bert for knowledge graph completion. arXiv:1909.03193. https://doi.org/10.48550/arXiv.1909.03193

  • Yasunaga, M., Ren, H., Bosselut, A., et al. (2021). Qa-gnn: Reasoning with language models and knowledge graphs for question answering. In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. https://doi.org/10.18653/v1/2021.naacl-main.45

  • Yue, B., Gui, M., Guo, J., et al. (2017). An effective framework for question answering over freebase via reconstructing natural sequences. In: Proceedings of the 26th International Conference on World Wide Web Companion. https://doi.org/10.1145/3041021.3054240

  • Zafar, A., Sahoo, S.K., Bhardawaj, H., et al. (2023). Ki-mag: A knowledge-infused abstractive question answering system in medical domain. Neurocomputing. https://doi.org/10.1016/j.neucom.2023.127141

  • Zhang, X., Bosselut, A., Yasunaga, M., et al. (2022). Greaselm: Graph reasoning enhanced language models for question answering. In: International Conference on Representation Learning (ICLR). https://doi.org/10.48550/arXiv.2201.08860

  • Zhang, Y., Chen, Q., Yang, Z., et al. (2019). Biowordvec, improving biomedical word embeddings with subword information and mesh. Scientific Data, 6. https://doi.org/10.1038/s41597-019-0055-0

  • Zhang, Y., Qi, P., Manning, C.D. (2018). Graph convolution over pruned dependency trees improves relation extraction. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/D18-1244

  • Zheng, S., Rao, J., Song, Y., et al. (2021). Pharmkg: a dedicated knowledge graph benchmark for biomedical data mining. Briefings in Bioinformatics, 22. https://doi.org/10.1093/bib/bbaa344

  • Zhu, M., Ahuja, A., Juan, D.C., et al. (2020). Question answering with long multiple-span answers. In: Findings of the Association for Computational Linguistics: EMNLP 2020. https://doi.org/10.18653/v1/2020.findings-emnlp.342

  • Zhu, Y., Kiros, R., Zemel, R., et al. (2015). Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). https://doi.org/10.1109/ICCV.2015.11


Acknowledgements

The authors gratefully acknowledge the support from the projects "Percuro: A Holistic Solution for Text Mining", sponsored by Wipro Ltd., and "Sevak: An Intelligent Indian Language Chatbot", sponsored by Imprint 2, SERB, Government of India.

Funding

The authors did not receive support from any organization for the submitted work.

Author information


Contributions

Aizan: Conceptualization, Methodology, Software, Validation, Writing – original draft, Investigation. Sovan Kumar Sahoo: Conceptualization, Writing – original draft, Investigation. Deeksha Varshney: Writing – review and editing, Investigation, error analysis. Amitava Das: Writing – review and editing, Supervision, Resources. Asif Ekbal: Writing – review and editing, Supervision, Resources.

Corresponding authors

Correspondence to Aizan Zafar or Asif Ekbal.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Ethical Approval

We make use of publicly available datasets. Without violating any copyright issues, we followed the policies of the datasets we used.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Zafar, A., Sahoo, S.K., Varshney, D. et al. KIMedQA: towards building knowledge-enhanced medical QA models. J Intell Inf Syst (2024). https://doi.org/10.1007/s10844-024-00844-1

