Knowledge and Information Systems, Volume 19, Issue 1, pp 79–105

Encoding and decoding the knowledge of association rules over SVM classification trees

Regular Paper

Abstract

This paper presents a constructive method for association rule extraction, in which the knowledge of the data is encoded into an SVM classification tree (SVMT) and linguistic association rules are extracted by decoding the trained SVMT. This method of rule extraction over the SVMT (SVMT-rule), in the spirit of decision-tree rule extraction, extracts rules not only from the individual SVMs but also over the decision-tree structure of the SVMT. The rules obtained from SVMT-rule thus combine the comprehensibility of decision-tree rules with the good classification accuracy of the SVM. Moreover, benefiting from the strong generalization ability that the SVMT gains by aggregating a group of SVMs, SVMT-rule performs very robust classification on datasets whose class distribution is seriously, even overwhelmingly, imbalanced. Experiments with Gaussian synthetic data, seven benchmark cancer-diagnosis datasets, and an application to cell-phone fraud detection highlight the utility of SVMT and SVMT-rule for comprehensible and effective knowledge discovery, as well as the superior properties of SVMT-rule compared with purely support-vector-based rule extraction. (A Matlab implementation of SVMT is available online at http://kcir.kedri.info)
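
To make the encode/decode procedure concrete, here is a minimal, illustrative sketch in Python (the published software itself is in Matlab). It grows a small tree whose internal nodes partition the data and whose leaves hold locally trained SVMs, then reads IF-THEN rules off the root-to-leaf paths. The names SVMTNode, fit_svmt, and decode_rules, and the variance-based split heuristic, are assumptions made for illustration only; the paper's actual SVMT construction and rule-decoding procedure differ in detail.

```python
import numpy as np
from sklearn.svm import SVC

class SVMTNode:
    """A node of a toy SVM classification tree (hypothetical structure)."""
    def __init__(self, feature=None, threshold=None, left=None, right=None,
                 svm=None, label=None):
        self.feature, self.threshold = feature, threshold  # split test (internal nodes)
        self.left, self.right = left, right                # children (internal nodes)
        self.svm, self.label = svm, label                  # payload (leaf nodes)

def fit_svmt(X, y, depth=0, max_depth=2, min_leaf=20):
    """Encode: recursively partition the data, training a local SVM per region."""
    if len(set(y)) == 1:                          # pure region: constant-label leaf
        return SVMTNode(label=int(y[0]))
    if depth == max_depth or len(y) < min_leaf:   # mixed region: SVM leaf
        return SVMTNode(svm=SVC(kernel='linear').fit(X, y))
    f = int(np.argmax(X.var(axis=0)))             # toy heuristic: split at the
    t = float(np.median(X[:, f]))                 # median of the most varied feature
    mask = X[:, f] <= t
    if mask.all() or not mask.any():              # degenerate split: SVM leaf instead
        return SVMTNode(svm=SVC(kernel='linear').fit(X, y))
    return SVMTNode(feature=f, threshold=t,
                    left=fit_svmt(X[mask], y[mask], depth + 1, max_depth, min_leaf),
                    right=fit_svmt(X[~mask], y[~mask], depth + 1, max_depth, min_leaf))

def decode_rules(node, conditions=()):
    """Decode: walk every root-to-leaf path and emit one IF-THEN rule per path."""
    if node.left is None:                         # leaf reached
        head = ("class = %d" % node.label) if node.svm is None else "defer to the leaf SVM"
        yield "IF %s THEN %s" % (" AND ".join(conditions) or "TRUE", head)
        return
    yield from decode_rules(node.left, conditions + ("x%d <= %.2f" % (node.feature, node.threshold),))
    yield from decode_rules(node.right, conditions + ("x%d > %.2f" % (node.feature, node.threshold),))

# Demo on 2-D Gaussian synthetic data, echoing the abstract's first experiment.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
for rule in decode_rules(fit_svmt(X, y)):
    print(rule)
```

In this sketch a leaf that still contains both classes defers to its local SVM; in the paper's full method those leaf SVMs would themselves be decoded into support-vector-based rule conditions rather than left opaque.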

Keywords

Association rule extraction · Support vector machine · SVM aggregating intelligence · SVM ensemble · SVM classification tree · Class imbalance · Class overlap

Copyright information

© Springer-Verlag London Limited 2008

Authors and Affiliations

  1. Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand