Topic model for Chinese medicine diagnosis and prescription regularities analysis: Case on diabetes

  • Thinking and Methodology
Chinese Journal of Integrative Medicine

Abstract

Inducing common knowledge or regularities from large-scale clinical data is a vital task for Chinese medicine (CM). In this paper, we propose a data mining method, called the Symptom-Herb-Diagnosis topic (SHDT) model, to automatically extract the common relationships among symptoms, herb combinations and diagnoses from large-scale CM clinical data. The SHDT model is a multi-relational extension of the latent topic model, which acquires topic structure from discrete corpora (such as document collections) by capturing the semantic relations among words. We applied the SHDT model to 3 238 inpatient cases to discover common CM diagnosis and treatment knowledge for type 2 diabetes mellitus (T2DM). We obtained meaningful diagnosis and treatment topics (clusters) from the data, which clinically indicated some important medical groups corresponding to comorbid diseases (e.g., heart disease and diabetic kidney disease in T2DM inpatients). The results show that manifestation sub-categories exist among T2DM patients and that these sub-categories need specific, individualised CM therapies. Furthermore, the results demonstrate that this method is helpful for generating CM clinical guidelines for T2DM from clinical data collected in structured form.
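To make the idea of a latent topic model over clinical records more concrete, the sketch below fits plain latent Dirichlet allocation by collapsed Gibbs sampling to bags of symptom and herb tokens. It is not the SHDT model: SHDT is described above as a multi-relational extension, whereas this sketch treats every token alike. The function name `fit_lda`, the hyperparameter values, and the placeholder tokens (`sym:A1`, `herb:B2`, and so on) are all assumptions made for illustration and do not come from the paper or its data.

```python
import random


def fit_lda(records, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for plain LDA (not SHDT) over bags of tokens."""
    rng = random.Random(seed)
    vocab = sorted({tok for rec in records for tok in rec})
    index = {tok: i for i, tok in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: record-topic, topic-token, and per-topic totals.
    doc_topic = [[0] * num_topics for _ in records]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token occurrence.
    assignments = []
    for d, rec in enumerate(records):
        zs = []
        for tok in rec:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[tok]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, rec in enumerate(records):
            for i, tok in enumerate(rec):
                w = index[tok]
                z = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-topic token distributions; each row sums to 1.
    phi = [
        [
            (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
            for w in range(vocab_size)
        ]
        for k in range(num_topics)
    ]
    return vocab, phi, doc_topic


# Hypothetical toy records: placeholder tokens only, not clinical data.
toy_records = [
    ["sym:A1", "sym:A2", "herb:A1", "herb:A2"],
    ["sym:A1", "herb:A1", "herb:A2"],
    ["sym:B1", "sym:B2", "herb:B1", "herb:B2"],
    ["sym:B1", "sym:B2", "herb:B2"],
]

vocab, phi, doc_topic = fit_lda(toy_records, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(f"topic {k}:", [tok for _, tok in top])
```

Each row of `phi` is one topic's distribution over tokens, so listing its highest-probability entries gives the kind of symptom and herb cluster the abstract calls a topic. A multi-relational model in the spirit of SHDT would presumably keep separate distributions for symptoms, herbs and diagnoses rather than one shared vocabulary; that extension is not attempted here.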



Author information

Corresponding author

Correspondence to Shi-bo Chen (陈世波).

Additional information

Supported by the Scientific Breakthrough Program of Beijing Municipal Science & Technology Commission, China (No. D08050703020803, No. D08050703020804); China Key Technologies R&D Programme (No. 2007BA110B06-01); Major State Basic Research Development Program of China (973 Program, No. 2006CB504601); National Natural Science Foundation of China (No. 90709006); National Science and Technology Major Project of the Ministry of Science and Technology of China (No. 2009ZX10005-019).

About this article

Cite this article

Zhang, Xp., Zhou, Xz., Huang, Hk. et al. Topic model for Chinese medicine diagnosis and prescription regularities analysis: Case on diabetes. Chin. J. Integr. Med. 17, 307–313 (2011). https://doi.org/10.1007/s11655-011-0699-x

  • DOI: https://doi.org/10.1007/s11655-011-0699-x
