Medical Diagnosis by Using Machine Learning Techniques

Chapter in: Data Analytics for Traditional Chinese Medicine Research

Abstract

Data analytics research for Traditional Chinese Medicine (TCM) faces many challenges: clinical records come from varied sources, symptoms are described inconsistently, a large number of clinical symptoms are collected, and a single clinical record may carry more than one syndrome. Novel feature selection, multi-class, and multi-label machine learning methods have been proposed to meet these challenges. In this chapter, we introduce our work on discriminative symptom selection and multi-syndrome learning, which improves on the performance of state-of-the-art methods.

Author information

Corresponding author

Correspondence to Guo-Zheng Li.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

You, M., Li, GZ. (2014). Medical Diagnosis by Using Machine Learning Techniques. In: Poon, J., Poon, S.K. (eds) Data Analytics for Traditional Chinese Medicine Research. Springer, Cham. https://doi.org/10.1007/978-3-319-03801-8_3

  • DOI: https://doi.org/10.1007/978-3-319-03801-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03800-1

  • Online ISBN: 978-3-319-03801-8

  • eBook Packages: Computer Science (R0)
