Medical Diagnosis by Using Machine Learning Techniques
Chapter
Abstract
Data analytic research for Traditional Chinese Medicine (TCM) faces many challenges: clinical records come from varied sources, symptoms are described inconsistently, large numbers of clinical symptoms are collected, and a single clinical record may carry more than one syndrome. Novel feature selection, multi-class, and multi-label techniques from machine learning have been proposed to meet these challenges. In this chapter, we introduce our work on discriminative symptom selection and multi-syndrome learning, which improves on the performance of state-of-the-art methods.
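To make the multi-syndrome setting concrete, the following is a minimal sketch and not the chapter's own method. It assumes a toy dataset in which each record pairs binary symptom indicators with a set of syndromes, splits the multi-label problem into one binary problem per syndrome (a common baseline often called binary relevance), and ranks symptoms with a simple frequency-difference score. The symptom names, syndrome names, records, and scoring rule are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy records: (symptom indicators, set of syndromes).
# A record may carry more than one syndrome, which makes the task multi-label.
records = [
    ({"chest_pain": 1, "fatigue": 1, "cold_limbs": 0}, {"qi_deficiency", "blood_stasis"}),
    ({"chest_pain": 1, "fatigue": 0, "cold_limbs": 0}, {"blood_stasis"}),
    ({"chest_pain": 0, "fatigue": 1, "cold_limbs": 1}, {"qi_deficiency", "yang_deficiency"}),
    ({"chest_pain": 0, "fatigue": 0, "cold_limbs": 1}, {"yang_deficiency"}),
]


def binary_relevance(records):
    """Split a multi-label dataset into one binary dataset per syndrome."""
    labels = sorted(set().union(*(syndromes for _, syndromes in records)))
    return {
        label: [(symptoms, int(label in syndromes)) for symptoms, syndromes in records]
        for label in labels
    }


def symptom_scores(binary_data):
    """Score each symptom by P(symptom | syndrome) - P(symptom | no syndrome).

    This is an assumed, deliberately simple filter criterion.
    """
    pos = [s for s, y in binary_data if y == 1]
    neg = [s for s, y in binary_data if y == 0]
    scores = {}
    for name in binary_data[0][0]:
        p_pos = sum(s[name] for s in pos) / len(pos) if pos else 0.0
        p_neg = sum(s[name] for s in neg) / len(neg) if neg else 0.0
        scores[name] = p_pos - p_neg
    return scores


def select_symptoms(records, k=1):
    """Keep the k highest-scoring symptoms for each syndrome."""
    selected = {}
    for label, data in binary_relevance(records).items():
        scores = symptom_scores(data)
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda name: (-scores[name], name))
        selected[label] = ranked[:k]
    return selected


selected = select_symptoms(records, k=1)
for syndrome, symptoms in selected.items():
    print(syndrome, symptoms)
```

Treating each syndrome independently keeps the sketch short, but it ignores correlations between syndromes, which is one of the reasons dedicated multi-label methods are studied.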
Keywords
Feature Selection, Simulated Annealing, Feature Subset, Average Precision, Binary Classifier
Copyright information
© Springer International Publishing Switzerland 2014