
Knowledge Graph Enhanced Transformers for Diagnosis Generation of Chinese Medicine

  • New Technique for Chinese Medicine
  • Published in: Chinese Journal of Integrative Medicine

Abstract

Intelligent diagnosis is one of the hotspots in research on the modernization of Chinese medicine (CM). Traditional intelligent CM diagnosis models cast diagnosis as a classification task, but this framing copes poorly with problems such as excessive or highly similar categories. With the development of natural language processing, text generation techniques have become increasingly mature. In this study, we aimed to establish a CM diagnosis generation model by reformulating CM diagnosis as a text generation task. With Transformer as the backbone network, a Bidirectional Long Short-Term Memory (BiLSTM) component was introduced to enhance the learning of semantic context features. Meanwhile, medical domain knowledge was introduced through a knowledge graph to strengthen inferential capability, yielding the CM diagnosis generation model Knowledge Graph Enhanced Transformer (KGET). The KGET model was built on 566 CM case texts and compared with classic text generation models, including Long Short-Term Memory sequence-to-sequence (LSTM-seq2seq), Bidirectional and Auto-Regressive Transformer (BART), and Chinese Pre-trained Unbalanced Transformer (CPT), to analyze its performance. Finally, ablation experiments were performed to explore the influence of each optimized component on the KGET model. In this study, the KGET model achieved Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE1 and ROUGE2), and edit distance scores of 45.85, 73.93, 54.59 and 7.12, respectively. Compared with the LSTM-seq2seq, BART and CPT models, the KGET model scored higher in BLEU, ROUGE1 and ROUGE2 by 6.00–17.09, 1.65–9.39 and 0.51–17.62, respectively, and lower in edit distance by 0.47–3.21. The ablation experiments revealed that introducing the BiLSTM component and prior knowledge significantly improved model performance. Additionally, manual assessment indicated that the CM diagnoses generated by the KGET model were highly consistent with the practical diagnosis results. In conclusion, text generation technology can be effectively applied to CM diagnostic modeling; it avoids the poor diagnostic performance caused by excessive and similar categories in traditional CM diagnostic classification models, and CM diagnostic text generation has broad application prospects.
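
The abstract describes the architecture only at a high level. As a rough illustration of the central idea of strengthening a Transformer encoder's semantic context learning with a BiLSTM, the minimal PyTorch sketch below runs the encoder output through a bidirectional LSTM; all class names, variable names and hyperparameters are illustrative assumptions, and the sketch does not reproduce the released KGET implementation or its knowledge graph component (the authors' code is in the repository linked under Availability of Data and Material).

```python
# Minimal sketch (not the authors' released code): a Transformer encoder whose
# output is passed through a BiLSTM layer, mirroring the idea of adding
# bidirectional recurrence to strengthen semantic context learning.
import torch
import torch.nn as nn

class BiLSTMTransformerEncoder(nn.Module):
    def __init__(self, vocab_size, d_model=256, n_heads=4, n_layers=2, lstm_hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Bidirectional LSTM over the Transformer outputs; the concatenated
        # forward/backward states are projected back to d_model.
        self.bilstm = nn.LSTM(d_model, lstm_hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * lstm_hidden, d_model)

    def forward(self, token_ids, padding_mask=None):
        x = self.embed(token_ids)                            # (batch, seq, d_model)
        x = self.encoder(x, src_key_padding_mask=padding_mask)
        x, _ = self.bilstm(x)                                # (batch, seq, 2*lstm_hidden)
        return self.proj(x)                                  # (batch, seq, d_model)

# Toy usage on random token ids standing in for tokenized case texts.
if __name__ == "__main__":
    model = BiLSTMTransformerEncoder(vocab_size=5000)
    ids = torch.randint(0, 5000, (2, 32))
    print(model(ids).shape)  # torch.Size([2, 32, 256])
```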


Availability of Data and Material

The datasets and source code used in the current study are available in the GitHub repository at https://github.com/084016139/KGET.

References

  1. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; June 2019; Minneapolis, Minnesota, USA: Association for Computational Linguistics; 2019.

  2. Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, et al. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics; July 2020; Washington, USA (Online): Association for Computational Linguistics; 2020.

  3. Shao YF, Geng ZC, Liu YT, Dai JQ, Yan H, Yang F, et al. CPT: a pre-trained unbalanced transformer for both Chinese language understanding and generation. Sci China Inf Sci; arXiv preprint arXiv:2109.05729, 2021.

  4. Cui Y, Che W, Liu T, Qin B, Yang Z. Pre-training with whole word masking for Chinese BERT. IEEE/ACM Trans Audio Speech Lang Process 2021;29:3504–3514.

  5. Zhou P, Shi W, Tian J, Qi Z, Li B, Hao H, et al. Attention-based bidirectional long short-term memory networks for relation classification. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics; August 2016; Berlin, Germany: Association for Computational Linguistics; 2016.

  6. Luong MT, Pham H, Manning CD. Effective approaches to attention-based neural machine translation. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing; September 2015; Lisbon, Portugal: Association for Computational Linguistics; 2015.

  7. Zhou M, Huang M, Zhu X. An interpretable reasoning network for multi-relation question answering. Proceedings of the 27th International Conference on Computational Linguistics; August 2018; Santa Fe, New Mexico, USA: Association for Computational Linguistics; 2018.

  8. Sutskever I, Vinyals O, Le QV. Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems 27 (NIPS 2014); December 2014; Montreal, Canada: NeurIPS Proceedings; 2014.

  9. Platt J. Sequential minimal optimization: a fast algorithm for training support vector machines. Microsoft; 1998 April. Report No.: MSR-TR-98-14. Accessed on September 20, 2022. Available at: https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/.

  10. Abeywickrama T, Cheema MA, Taniar D. K-nearest neighbors on road networks: a journey in experimentation and in-memory implementation. Proceedings of the VLDB Endowment; 2016; New Delhi, India: Association for Computing Machinery (ACM); 2016.

  11. Rafael PdL, Fnu S, Kurt M, Matthew P, Gerilyn S. Convolutional neural networks: AAPG Explorer; 2018 [updated October]. Accessed on September 20, 2022. Available at: https://explorer.aapg.org/story/articleid/49527/convolutional-neural-networks.

  12. Zaremba W, Sutskever I, Vinyals O. Recurrent neural network regularization. International Conference on Learning Representations (ICLR 2015); May 2015; San Diego, CA, USA: Computational and Biological Learning Society; 2015.

  13. Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J. LSTM: a search space odyssey. IEEE Trans Neural Netw Learn Syst 2016;28:2222–2232.

  14. Xia C, Deng F, Wang Y, Xu Z, Liu G, Xu J, et al. Classification research on syndromes of TCM based on SVM. 2009 2nd International Conference on Biomedical Engineering and Informatics; October 2009; Tianjin, China: IEEE; 2009.

  15. Zhou H, Hu G, Zhang X. Constitution identification of tongue image based on CNN. 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI); October 2018; Beijing, China: IEEE; 2018.

  16. Liu GP, Li GZ, Wang YL, Wang YQ. Modelling of inquiry diagnosis for coronary heart disease in traditional Chinese medicine by using multi-label learning. BMC Compl Altern Med 2010;10:1–12.

  17. Liu Z, He H, Yan S, Wang Y, Yang T, Li GZ. End-to-end models to imitate traditional Chinese medicine syndrome differentiation in lung cancer diagnosis: model development and validation. JMIR Med Inform 2020;8:e17821.

  18. Pan Y, Chen Q, Peng W, Wang X, Hu B, Liu X, et al. Medwriter: Knowledge-aware medical text generation. Proceedings of the 28th International Conference on Computational Linguistics; December 2020; Barcelona, Spain (Online): International Committee on Computational Linguistics; 2020.

  19. Afzal M, Alam F, Malik KM, Malik GM. Clinical context-aware biomedical text summarization using deep neural network: model development and validation. J Med Internet Res 2020;22:e19810.

  20. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in Neural Information Processing Systems 30 (NIPS 2017); December 2017; Long Beach, CA, USA: NeurIPS Proceedings; 2017.

  21. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. OpenAI Blog 2019;1:1–24.

  22. Fan A, Grave E, Joulin A. Reducing transformer depth on demand with structured dropout. International Conference on Learning Representations (ICLR 2020); April 2020; Addis Ababa, Ethiopia (Online): Computational and Biological Learning Society; 2020.

  23. Dai Z, Yang Z, Yang Y, Carbonell JG, Le Q, Salakhutdinov R. Transformer-XL: Attentive language models beyond a fixed-length context. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; July 2019; Florence, Italy: Association for Computational Linguistics; 2019.

  24. Yang M, Wang J. Adaptability of Financial Time Series prediction based on BILSTM. Procedia Comput Sci 2022;199:18–25.

  25. Freitag M, Al-Onaizan Y. Beam search strategies for neural machine translation. Proceedings of the First Workshop on Neural Machine Translation; August 2017; Vancouver, Canada: Association for Computational Linguistics; 2017.

  26. Puth MT, Neuhäuser M, Ruxton GD. Effective use of Pearson’s product-moment correlation coefficient. Anim Behav 2014;93:183–189.

  27. Papineni K, Roukos S, Ward T, Zhu WJ. BLEU: a method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics; July 2002; Philadelphia, Pennsylvania, USA: Association for Computational Linguistics; 2002.

  28. Radev D, Hovy E, McKeown K. Introduction to the special issue on summarization. Comput Linguist 2002;28:399–408.

  29. Marzal A, Vidal E. Computation of normalized edit distance and applications. IEEE Trans Pattern Anal Mach Intell 1993;15:926–932.


Author information

Contributions

Yang T and Hu KF proposed the idea and framework of this paper. Wang XY and Yang T implemented the approach, analyzed the data, and wrote the paper. Gao XY preprocessed the raw data and contributed to the analysis of the experimental results and the revision of the manuscript. Yang T and Wang XY contributed equally to this work and should be regarded as co-first authors. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Kong-fa Hu.

Ethics declarations

The authors declare that they have no competing interests.

Additional information

Supported by the National Natural Science Foundation of China (Nos. 82174276 and 82074580), the Key Research and Development Program of Jiangsu Province (No. BE2022712), the China Postdoctoral Science Foundation (No. 2021M701674), the Postdoctoral Research Program of Jiangsu Province (No. 2021K457C), and the Qinglan Project of Jiangsu Universities (2021).


Cite this article

Wang, Xy., Yang, T., Gao, Xy. et al. Knowledge Graph Enhanced Transformers for Diagnosis Generation of Chinese Medicine. Chin. J. Integr. Med. 30, 267–276 (2024). https://doi.org/10.1007/s11655-023-3612-5

